Streptococcus pneumoniae Polynucleotides and Sequences

FIELD OF THE INVENTION 

The present invention relates to the field of molecular biology. In
particular, it relates to, among other things, nucleotide sequences of Streptococcus
pneumoniae, contigs, ORFs, fragments, probes, primers and related
polynucleotides thereof, peptides and polypeptides encoded by the sequences, and
uses of the polynucleotides and sequences thereof, such as in fermentation,
polypeptide production, assays and pharmaceutical development, among others.

BACKGROUND OF THE INVENTION 

Streptococcus pneumoniae has been one of the most extensively studied
microorganisms since its first isolation in 1881. It was the object of many
investigations that led to important scientific discoveries. In 1928, Griffith
observed that when heat-killed encapsulated pneumococci and live strains
constitutively lacking any capsule were concomitantly injected into mice, the
nonencapsulated could be converted into encapsulated pneumococci with the same
capsular type as the heat-killed strain. Years later, the nature of this "transforming
principle," or carrier of genetic information, was shown to be DNA. (Avery, O.T.,
et al., J. Exp. Med. 79:137-157 (1944)).

In spite of the vast number of publications on S. pneumoniae, many
questions about its virulence are still unanswered, and this pathogen remains a
major causative agent of serious human disease, especially community-acquired
pneumonia. (Johnston, R.B., et al., Rev. Infect. Dis. 13(Suppl. 6):S509-517
(1991)). In addition, in developing countries, the pneumococcus is responsible for
the death of a large number of children under the age of 5 years from pneumococcal
pneumonia. The incidence of pneumococcal disease is highest in infants under 2
years of age and in people over 60 years of age. Pneumococci are the second most
frequent cause (after Haemophilus influenzae type b) of bacterial meningitis and
otitis media in children. With the recent introduction of conjugate vaccines for H.
influenzae type b, pneumococcal meningitis is likely to become increasingly
prominent. S. pneumoniae is the most important etiologic agent of community-acquired
pneumonia in adults and is the second most common cause of bacterial
meningitis behind Neisseria meningitidis.

The antibiotic generally prescribed to treat S. pneumoniae is
benzylpenicillin, although resistance to this and to other antibiotics is found
occasionally. Pneumococcal resistance to penicillin results from mutations in its
penicillin-binding proteins. In uncomplicated pneumococcal pneumonia caused by
a sensitive strain, treatment with penicillin is usually successful unless started too
late. Erythromycin or clindamycin can be used to treat pneumonia in patients
hypersensitive to penicillin, but strains resistant to these drugs exist. Broad
spectrum antibiotics (e.g., the tetracyclines) may also be effective, although
tetracycline-resistant strains are not rare. In spite of the availability of antibiotics,
the mortality of pneumococcal bacteremia in the last four decades has remained
stable between 25 and 29%. (Gillespie, S.H., et al., J. Med. Microbiol. 28:237-
248 (1989)).

S. pneumoniae is carried in the upper respiratory tract by many healthy
individuals. It has been suggested that attachment of pneumococci is mediated by a
disaccharide receptor on fibronectin, present on human pharyngeal epithelial cells.
(Anderson, B.J., et al., J. Immunol. 142:2464-2468 (1989)). The mechanisms by
which pneumococci translocate from the nasopharynx to the lung, thereby causing
pneumonia, or migrate to the blood, giving rise to bacteremia or septicemia, are
poorly understood. (Johnston, R.B., et al., Rev. Infect. Dis. 13(Suppl. 6):S509-
517 (1991)).

Various proteins have been suggested to be involved in the pathogenicity of
S. pneumoniae; however, only a few of them have actually been confirmed as
virulence factors. Pneumococci produce an IgA1 protease that might interfere with
host defense at mucosal surfaces. (Kornfield, S.J., et al., Rev. Inf. Dis. 3:521-
534 (1981)). S. pneumoniae also produces neuraminidase, an enzyme that may
facilitate attachment to epithelial cells by cleaving sialic acid from the host
glycolipids and gangliosides. Partially purified neuraminidase was observed to
induce meningitis-like symptoms in mice; however, the reliability of this finding
has been questioned because the neuraminidase preparations used were probably
contaminated with cell wall products. Other pneumococcal proteins besides
neuraminidase are involved in the adhesion of pneumococci to epithelial and
endothelial cells. These pneumococcal proteins have as yet not been identified.

WO 98/18931    PCT/US97/19588

Recently, Cundell et al. reported that peptide permeases can modulate
pneumococcal adherence to epithelial and endothelial cells. It was, however,
unclear whether these permeases function directly as adhesins or whether they
enhance adherence by modulating the expression of pneumococcal adhesins.
(DeVelasco, E.A., et al., Micro. Rev. 59:591-603 (1995)). A better understanding
of the virulence factors determining its pathogenicity will need to be developed to
cope with the devastating effects of pneumococcal disease in humans.

Ironically, despite the prominent role of S. pneumoniae in the discovery of
DNA, little is known about the molecular genetics of the organism. The S.
pneumoniae genome consists of one circular, covalently closed, double-stranded
DNA and a collection of so-called variable accessory elements, such as prophages,
plasmids, transposons and the like. Most physical characteristics and almost all of
the genes of S. pneumoniae are unknown. Among the few that have been
identified, most have not been physically mapped or characterized in detail. Only a
few genes of this organism have been sequenced. (See, for instance, current
versions of GENBANK and other nucleic acid databases, and references that relate
to the genome of S. pneumoniae such as those set out elsewhere herein.)

It is clear that the etiology of diseases mediated or exacerbated by S.
pneumoniae infection involves the programmed expression of S. pneumoniae
genes, and that characterizing the genes and their patterns of expression would add
dramatically to our understanding of the organism and its host interactions.
Knowledge of S. pneumoniae genes and genomic organization would improve our
understanding of disease etiology and lead to improved and new ways of
preventing, ameliorating, arresting and reversing diseases. Moreover,
characterized genes and genomic fragments of S. pneumoniae would provide
reagents for, among other things, detecting, characterizing and controlling S.
pneumoniae infections. There is a need to characterize the genome of S.
pneumoniae and for polynucleotides of this organism.
SUMMARY OF THE INVENTION

The present invention is based on the sequencing of fragments of the 
Streptococcus pneumoniae genome. The primary nucleotide sequences which were 
generated are provided in SEQ ID NOS: 1-391. 

The present invention provides the nucleotide sequence of several hundred 
contigs of the Streptococcus pneumoniae genome, which are listed in tables below 
and set out in the Sequence Listing submitted herewith, and representative 
fragments thereof, in a form which can be readily used, analyzed, and interpreted 
by a skilled artisan. In one embodiment, the present invention is provided as 
contiguous strings of primary sequence information corresponding to the 
nucleotide sequences depicted in SEQ ID NOS:1-391.

The present invention further provides nucleotide sequences which are at 
least 95% identical to the nucleotide sequences of SEQ ID NOS: 1-391. 

The nucleotide sequence of SEQ ID NOS:1-391, a representative fragment
thereof, or a nucleotide sequence which is at least 95% identical to the nucleotide
sequence of SEQ ID NOS:1-391 may be provided in a variety of media to
facilitate its use. In one application of this embodiment, the sequences of the
present invention are recorded on computer readable media. Such media include,
but are not limited to: magnetic storage media, such as floppy discs, hard disc
storage medium, and magnetic tape; optical storage media such as CD-ROM;
electrical storage media such as RAM and ROM; and hybrids of these categories
such as magnetic/optical storage media.

The present invention further provides systems, particularly computer- 
based systems which contain the sequence information herein described stored in a 
data storage means. Such systems are designed to identify commercially important 
fragments of the Streptococcus pneumoniae genome. 

Another embodiment of the present invention is directed to fragments of the
Streptococcus pneumoniae genome having particular structural or functional
attributes. Such fragments of the Streptococcus pneumoniae genome of the present
invention include, but are not limited to, fragments which encode peptides,
hereinafter referred to as open reading frames or ORFs, fragments which modulate
the expression of an operably linked ORF, hereinafter referred to as expression
modulating fragments or EMFs, and fragments which can be used to diagnose the
presence of Streptococcus pneumoniae in a sample, hereinafter referred to as
diagnostic fragments or DFs.

Each of the ORFs in fragments of the Streptococcus pneumoniae genome
disclosed in Tables 1-3, and the EMFs found 5' to the ORFs, can be used in
numerous ways as polynucleotide reagents. For instance, the sequences can be
used as diagnostic probes or amplification primers for detecting or determining the
presence of a specific microbe in a sample, to selectively control gene expression in
a host, and in the production of polypeptides, such as polypeptides encoded by
ORFs of the present invention, particularly those polypeptides that have a
pharmacological activity.

The present invention further includes recombinant constructs comprising
one or more fragments of the Streptococcus pneumoniae genome of the present
invention. The recombinant constructs of the present invention comprise vectors,
such as a plasmid or viral vector, into which a fragment of the Streptococcus
pneumoniae genome has been inserted.

The present invention further provides host cells containing any of the
isolated fragments of the Streptococcus pneumoniae genome of the present
invention. The host cells can be a higher eukaryotic host cell, such as a mammalian
cell, a lower eukaryotic cell, such as a yeast cell, or a procaryotic cell such as a
bacterial cell.

The present invention is further directed to isolated polypeptides and
proteins encoded by ORFs of the present invention. A variety of methods, well
known to those of skill in the art, routinely may be utilized to obtain any of the
polypeptides and proteins of the present invention. For instance, polypeptides and
proteins of the present invention having relatively short, simple amino acid
sequences readily can be synthesized using commercially available automated
peptide synthesizers. Polypeptides and proteins of the present invention also may
be purified from bacterial cells which naturally produce the protein. Yet another
alternative is to purify polypeptides and proteins of the present invention from cells
which have been altered to express them.

The invention further provides methods of obtaining homologs of the
fragments of the Streptococcus pneumoniae genome of the present invention and
homologs of the proteins encoded by the ORFs of the present invention.
Specifically, by using the nucleotide and amino acid sequences disclosed herein as
a probe or as primers, and techniques such as PCR cloning and colony/plaque
hybridization, one skilled in the art can obtain homologs.

The invention further provides antibodies which selectively bind 
polypeptides and proteins of the present invention. Such antibodies include both 
monoclonal and polyclonal antibodies.

The invention further provides hybridomas which produce the above- 
described antibodies. A hybridoma is an immortalized cell line which is capable of 
secreting a specific monoclonal antibody. 

The present invention further provides methods of identifying test samples 
derived from cells which express one of the ORFs of the present invention, or a 
homolog thereof. Such methods comprise incubating a test sample with one or 
more of the antibodies of the present invention, or one or more of the DFs of the 
present invention, under conditions which allow a skilled artisan to determine if the 
sample contains the ORF or product produced therefrom. 

In another embodiment of the present invention, kits are provided which 
contain the necessary reagents to carry out the above-described assays. 

Specifically, the invention provides a compartmentalized kit to receive, in 
close confinement, one or more containers which comprises: (a) a first container 
comprising one of the antibodies, or one of the DFs of the present invention; and 
(b) one or more other containers comprising one or more of the following: wash 
reagents, reagents capable of detecting presence of bound antibodies or hybridized 
DFs. 

Using the isolated proteins of the present invention, the present invention 
further provides methods of obtaining and identifying agents capable of binding to 
a polypeptide or protein encoded by one of the ORFs of the present invention. 
Specifically, such agents include, as further described below, antibodies, peptides, 
carbohydrates, pharmaceutical agents and the like. Such methods comprise steps 
of: (a) contacting an agent with an isolated protein encoded by one of the ORFs of 
the present invention; and (b) determining whether the agent binds to said protein. 

The present genomic sequences of Streptococcus pneumoniae will be of
great value to all laboratories working with this organism and for a variety of
commercial purposes. Many fragments of the Streptococcus pneumoniae genome
will be immediately identified by similarity searches against GenBank or protein
databases and will be of immediate value to Streptococcus pneumoniae researchers
and of immediate commercial value for the production of proteins or the control of
gene expression.

The methodology and technology for elucidating extensive genomic
sequences of bacterial and other genomes have enhanced, and will greatly enhance,
the ability to analyze and understand chromosomal organization. In particular, sequenced
contigs and genomes will provide the models for developing tools for the analysis
of chromosome structure and function, including the ability to identify genes within
large segments of genomic DNA, the structure, position, and spacing of regulatory
elements, the identification of genes with potential industrial applications, and the
ability to do comparative genomics and molecular phylogeny.

DESCRIPTION OF THE FIGURES

FIGURE 1 is a block diagram of a computer system (102) that can be
used to implement computer-based systems of the present invention.

FIGURE 2 is a schematic diagram depicting the data flow and computer
programs used to collect, assemble, edit and annotate the contigs of the
Streptococcus pneumoniae genome of the present invention. Both Macintosh and
Unix platforms are used to handle the AB 373 and 377 sequence data files, largely
as described in Kerlavage et al., Proceedings of the Twenty-Sixth Annual Hawaii
International Conference on System Sciences, 585, IEEE Computer Society Press,
Washington D.C. (1993). Factura (AB) is a Macintosh program designed for
automatic vector sequence removal and end-trimming of sequence files. The
program Loadis runs on a Macintosh platform and parses the feature data extracted
from the sequence files by Factura to the Unix based Streptococcus pneumoniae
relational database. Assembly of contigs (and whole genome sequences) is
accomplished by retrieving a specific set of sequence files and their associated
features using Extrseq, a Unix utility for retrieving sequences from an SQL
database. The resulting sequence file is processed by seq_filter to trim portions of
the sequences with more than 2% ambiguous nucleotides. The sequence files were
assembled using TIGR Assembler, an assembly engine designed at The Institute
for Genomic Research (TIGR) for rapid and accurate assembly of thousands of
sequence fragments. The collection of contigs generated by the assembly step is
loaded into the database with the lassie program. Identification of open reading
frames (ORFs) is accomplished by processing contigs with zorf or GenMark. The
ORFs are searched against S. pneumoniae sequences from GenBank and against all
protein sequences using the BLASTN and BLASTP programs, described in
Altschul et al., J. Mol. Biol. 215:403-410 (1990). Results of the ORF
determination and similarity searching steps were loaded into the database. As
described below, some results of the determination and the searches are set out in
Tables 1-3.
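
The ORF identification step can be illustrated with a short sketch. The Python below is a naive six-frame scan for ATG-to-stop open reading frames. It is not the zorf or GenMark method, whose internals are not described here; the function names and the min_codons cutoff are hypothetical choices made only for illustration.

```python
# Illustrative only: a naive six-frame ORF scan, not the zorf or GenMark method.
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(contig, min_codons=30):
    """Yield (strand, start, end, orf_seq) for ATG...stop ORFs in all six frames.

    start and end are 0-based, end-exclusive offsets on the scanned strand.
    min_codons is a hypothetical cutoff that counts the start codon but not
    the stop codon.
    """
    contig = contig.upper()
    for strand, seq in (("+", contig), ("-", reverse_complement(contig))):
        for frame in range(3):
            start = None  # position of the first ATG in the current open stretch
            for pos in range(frame, len(seq) - 2, 3):
                codon = seq[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos
                elif start is not None and codon in STOP_CODONS:
                    if (pos - start) // 3 >= min_codons:
                        yield strand, start, pos + 3, seq[start:pos + 3]
                    start = None
```

A call such as list(find_orfs(contig_sequence)) returns the candidate ORFs, which could then be passed to a similarity search. Real gene finders also weigh codon usage and alternative start codons, which this sketch ignores.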

DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS



The present invention is based on the sequencing of fragments of the
Streptococcus pneumoniae genome and analysis of the sequences. The primary
nucleotide sequences generated by sequencing the fragments are provided in SEQ
ID NOS:1-391. (As used herein, the "primary sequence" refers to the nucleotide
sequence represented by the IUPAC nomenclature system.)

In addition to the aforementioned Streptococcus pneumoniae polynucleotides
and polynucleotide sequences, the present invention provides the nucleotide
sequences of SEQ ID NOS:1-391, or representative fragments thereof, in a form
which can be readily used, analyzed, and interpreted by a skilled artisan.

As used herein, a "representative fragment of the nucleotide sequence
depicted in SEQ ID NOS:1-391" refers to any portion of SEQ ID NOS:1-391
which is not presently represented within a publicly available database. Preferred
representative fragments of the present invention are Streptococcus pneumoniae
open reading frames (ORFs), expression modulating fragments (EMFs) and
fragments which can be used to diagnose the presence of Streptococcus
pneumoniae in a sample (DFs). A non-limiting identification of preferred
representative fragments is provided in Tables 1-3. As discussed in detail below,
the information provided in SEQ ID NOS:1-391 and in Tables 1-3 together with
routine cloning, synthesis, sequencing and assay methods will enable those skilled
in the art to clone and sequence all "representative fragments" of interest, including
open reading frames encoding a large variety of Streptococcus pneumoniae
proteins.

While the presently disclosed sequences of SEQ ID NOS:1-391 are highly
accurate, sequencing techniques are not perfect and, in relatively rare instances,
further investigation of a fragment or sequence of the invention may reveal a
nucleotide sequence error present in a nucleotide sequence disclosed in SEQ ID
NOS:1-391. However, once the present invention is made available (i.e., once the
information in SEQ ID NOS:1-391 and Tables 1-3 has been made available),
resolving a rare sequencing error in SEQ ID NOS:1-391 will be well within the
skill of the art. The present disclosure makes available sufficient sequence
information to allow any of the described contigs or portions thereof to be obtained
readily by straightforward application of routine techniques. Further sequencing of
such polynucleotides may proceed in like manner using manual and automated
sequencing methods which are employed ubiquitously in the art. Nucleotide
sequence editing software is publicly available. For example, Applied Biosystem's
(AB) AutoAssembler can be used as an aid during visual inspection of nucleotide
sequences. By employing such routine techniques potential errors readily may be
identified and the correct sequence then may be ascertained by targeting further
sequencing effort, also of a routine nature, to the region containing the potential
error.

Even if all of the very rare sequencing errors in SEQ ID NOS:1-391 were
corrected, the resulting nucleotide sequences would still be at least 95% identical,
nearly all would be at least 99% identical, and the great majority would be at least
99.9% identical to the nucleotide sequences of SEQ ID NOS:1-391.

As discussed elsewhere herein, polynucleotides of the present invention
readily may be obtained by routine application of well known and standard
procedures for cloning and sequencing DNA. Detailed methods for obtaining
libraries and for sequencing are provided below, for instance. A wide variety of
Streptococcus pneumoniae strains that can be used to prepare S. pneumoniae
genomic DNA for cloning and for obtaining polynucleotides of the present
invention are available to the public from recognized depository institutions, such
as the American Type Culture Collection (ATCC). While the present invention is
enabled by the sequences and other information herein disclosed, the S.
pneumoniae strain that provided the DNA of the present Sequence Listing, Strain
7/87 14.8.91, has been deposited in the ATCC, as a convenience to those of skill
in the art. As a further convenience, a library of S. pneumoniae genomic DNA,
derived from the same strain, also has been deposited in the ATCC. The S.
pneumoniae strain was deposited on October 10, 1996, and was given Deposit No.
55840, and the cDNA library was deposited on October 11, 1996 and was given
Deposit No. 97755. The genomic fragments in the library are 15 to 20 kb
fragments generated by partial Sau3AI digestion and they are inserted into the
BamHI site in the well-known lambda-derived vector lambda DASH II (Stratagene,
La Jolla, CA). The provision of the deposits is not a waiver of any rights of the
inventors or their assignees in the present subject matter.

The nucleotide sequences of the genomes from different strains of
Streptococcus pneumoniae differ somewhat. However, the nucleotide sequences
of the genomes of all Streptococcus pneumoniae strains will be at least 95%
identical, in corresponding part, to the nucleotide sequences provided in SEQ ID
NOS:1-391. Nearly all will be at least 99% identical and the great majority will be
99.9% identical.

Thus, the present invention further provides nucleotide sequences which 
are at least 95%, preferably 99% and most preferably 99.9% identical to the 
nucleotide sequences of SEQ ID NOS: 1-391, in a form which can be readily used, 
analyzed and interpreted by the skilled artisan. 

Methods for determining whether a nucleotide sequence is at least 95%, at
least 99% or at least 99.9% identical to the nucleotide sequences of SEQ ID
NOS:1-391 are routine and readily available to the skilled artisan. For example, the
well known FASTA algorithm described in Pearson and Lipman, Proc. Natl. Acad.
Sci. USA 85: 2444 (1988) can be used to generate the percent identity of nucleotide
sequences. The BLASTN program also can be used to generate an identity score
of polynucleotides compared to one another.
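
As a hedged illustration of how a percent identity figure relates to the 95%, 99% and 99.9% thresholds, the following Python sketch scores two sequences that have already been aligned by some alignment program. It does not reproduce FASTA or BLASTN, which perform the alignment themselves and may count gaps and alignment length differently. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns; '-' marks a gap.

    Assumes the two strings were already aligned to equal length.
    Columns in which both sequences have a gap are ignored, and a
    column with a gap in one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def identity_tier(pct):
    """Return the highest of the 99.9, 99.0 and 95.0 thresholds that pct meets."""
    for threshold in (99.9, 99.0, 95.0):
        if pct >= threshold:
            return threshold
    return None  # below 95% identity
```

For example, identity_tier(percent_identity(a, b)) reports which of the three identity levels named in the text a candidate sequence reaches, or None when it falls below 95%.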






COMPUTER RELATED EMBODIMENTS 

The nucleotide sequences provided in SEQ ID NOS:1-391, a representative
fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99%
and most preferably at least 99.9% identical to a polynucleotide sequence of SEQ
ID NOS:1-391 may be "provided" in a variety of media to facilitate use thereof.
As used herein, "provided" refers to a manufacture, other than an isolated nucleic
acid molecule, which contains a nucleotide sequence of the present invention; i.e.,
a nucleotide sequence provided in SEQ ID NOS:1-391, a representative fragment
thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most
preferably at least 99.9% identical to a polynucleotide of SEQ ID NOS:1-391.
Such a manufacture provides a large portion of the Streptococcus pneumoniae
genome and parts thereof (e.g., a Streptococcus pneumoniae open reading frame
(ORF)) in a form which allows a skilled artisan to examine the manufacture using
means not directly applicable to examining the Streptococcus pneumoniae genome
or a subset thereof as it exists in nature or in purified form.

In one application of this embodiment, a nucleotide sequence of the present
invention can be recorded on computer readable media. As used herein, "computer
readable media" refers to any medium which can be read and accessed directly by a
computer. Such media include, but are not limited to: magnetic storage media,
such as floppy discs, hard disc storage medium, and magnetic tape; optical storage
media such as CD-ROM; electrical storage media such as RAM and ROM; and
hybrids of these categories, such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable
media can be used to create a manufacture comprising computer readable
medium having recorded thereon a nucleotide sequence of the present invention.
Likewise, it will be clear to those of skill how additional computer readable media
that may be developed also can be used to create analogous manufactures having
recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on
computer readable medium. A skilled artisan can readily adopt any of the presently
known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present
invention. A variety of data storage structures are available to a skilled artisan
for creating a computer readable medium having recorded thereon a nucleotide
sequence of the present invention. The choice of the data storage structure will
generally be based on the means chosen to access the stored information. In
addition, a variety of data processor programs and formats can be used to store the
nucleotide sequence information of the present invention on computer readable
medium. The sequence information can be represented in a word processing text
file, formatted in commercially available software such as WordPerfect and
Microsoft Word, or represented in the form of an ASCII file, stored in a database
application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily
adapt any number of data-processor structuring formats (e.g., text file or database)
in order to obtain computer readable medium having recorded thereon the
nucleotide sequence information of the present invention.
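
The two storage routes named above, a plain ASCII file and a database application, can be sketched briefly. The Python below formats sequences as FASTA text and records them in a relational table. SQLite from the Python standard library stands in for the database applications named in the text purely for convenience; the table name, column names and record identifiers are hypothetical.

```python
import sqlite3


def to_fasta(records, width=60):
    """Format (identifier, sequence) pairs as FASTA text, a plain ASCII format."""
    lines = []
    for identifier, seq in records:
        lines.append(">" + identifier)
        # Wrap the sequence at a fixed line width.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


def store_in_database(conn, records):
    """Record sequences in a relational table keyed by identifier."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contig (id TEXT PRIMARY KEY, seq TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO contig (id, seq) VALUES (?, ?)", records)
    conn.commit()


def fetch_sequence(conn, identifier):
    """Return the stored sequence for an identifier, or None if absent."""
    row = conn.execute(
        "SELECT seq FROM contig WHERE id = ?", (identifier,)
    ).fetchone()
    return row[0] if row else None
```

The text form suits exchange and inspection, while the table form suits keyed retrieval, which reflects the point that the choice of storage structure follows from how the information will be accessed.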

Computer software is publicly available which allows a skilled artisan to
access sequence information provided in a computer readable medium. Thus, by
providing in computer readable form the nucleotide sequences of SEQ ID NOS:1-391,
a representative fragment thereof, or a nucleotide sequence at least 95%,
preferably at least 99% and most preferably at least 99.9% identical to a sequence
of SEQ ID NOS:1-391, the present invention enables the skilled artisan routinely to
access the provided sequence information for a wide variety of purposes.

The examples which follow demonstrate how software which implements
the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE
(Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase
system was used to identify open reading frames (ORFs) within the Streptococcus
pneumoniae genome which contain homology to ORFs or proteins from both
Streptococcus pneumoniae and from other organisms. Among the ORFs discussed
herein are protein encoding fragments of the Streptococcus pneumoniae genome
useful in producing commercially important proteins, such as enzymes used in
fermentation reactions and in the production of commercially useful metabolites.

The present invention further provides systems, particularly computer- 
based systems, which contain the sequence information described herein. Such 
systems are designed to identify, among other things, commercially important 
fragments of the Streptococcus pneumoniae genome. 

As used herein, "a computer-based system" refers to the hardware means,
software means, and data storage means used to analyze the nucleotide sequence
information of the present invention. The minimum hardware means of the
computer-based systems of the present invention comprises a central processing
unit (CPU), input means, output means, and data storage means. A skilled artisan
can readily appreciate that any one of the currently available computer-based
systems is suitable for use in the present invention.

As stated above, the computer-based systems of the present invention 
comprise a data storage means having stored therein a nucleotide sequence of the 
present invention and the necessary hardware means and software means for 
supporting and implementing a search means. 

As used herein, "data storage means" refers to memory which can store 
nucleotide sequence information of the present invention, or a memory access 
means which can access manufactures having recorded thereon the nucleotide 
sequence information of the present invention. 

As used herein, "search means" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or target
structural motif with the sequence information stored within the data storage
means. Search means are used to identify fragments or regions of the present
genomic sequences which match a particular target sequence or target motif. A
variety of known algorithms are disclosed publicly, and a variety of commercially
available software packages for conducting search means are available and can be
used in the computer-based systems of the present invention. Examples of such software
include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX
(NCBI). A skilled artisan can readily recognize that any one of the available
algorithms or implementing software packages for conducting homology searches
can be adapted for use in the present computer-based systems.
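
A minimal sketch of a search means is given below, assuming the stored sequences are held in memory as a mapping from identifier to DNA string. It reports exact occurrences of a target on either strand. This is a deliberately simple stand-in and not the algorithm of MacPattern, BLASTN or BLASTX; all names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def search_target(database, target):
    """Return (identifier, strand, offset) for every exact occurrence of target.

    database maps identifiers to uppercase DNA strings. Offsets are 0-based
    on the forward strand; a '-' hit reports where the reverse-complemented
    target begins on the forward strand.
    """
    target = target.upper()
    queries = [("+", target)]
    rc = reverse_complement(target)
    if rc != target:  # avoid reporting palindromic targets twice
        queries.append(("-", rc))
    hits = []
    for identifier, seq in database.items():
        for strand, query in queries:
            offset = seq.find(query)
            while offset != -1:
                hits.append((identifier, strand, offset))
                offset = seq.find(query, offset + 1)  # allow overlapping hits
    return hits
```

Homology search programs extend this idea by tolerating mismatches and gaps and by scoring each match, rather than requiring exact identity.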

As used herein, a "target sequence" can be any DNA or amino acid
sequence of six or more nucleotides or two or more amino acids. A skilled artisan
can readily recognize that the longer a target sequence is, the less likely a target
sequence will be present as a random occurrence in the database. The most
preferred sequence length of a target sequence is from about 10 to 100 amino acids
or from about 30 to 300 nucleotide residues. However, it is well recognized that
searches for commercially important fragments, such as sequence fragments
involved in gene expression and protein processing, may be of shorter length.
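
The relation between target length and chance occurrence can be made concrete with a small calculation. The Python sketch below assumes independent, equally likely residues and a single strand, which is a simplification of real genomic sequence, and the database length used with it is an arbitrary illustrative figure. The function name is hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected chance occurrences of one specific target in a random database.

    Assumes independent, equally likely residues and a single strand.
    alphabet_size is 4 for nucleotides and 20 for amino acids.
    """
    positions = database_length - target_length + 1
    if positions <= 0:
        return 0.0  # the target is longer than the database
    return positions / alphabet_size ** target_length
```

Under these assumptions, a specific 6-nucleotide target is expected to occur by chance hundreds of times in a database of 2,000,000 residues, whereas a specific 30-nucleotide target is expected far less than once, consistent with the preference for longer targets stated above.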

As used herein, "a target structural motif," or "target motif," refers to any
rationally selected sequence or combination of sequences in which the sequence(s)
are chosen based on a three-dimensional configuration which is formed upon the
folding of the target motif. There are a variety of target motifs known in the art.
Protein target motifs include, but are not limited to, enzymic active sites and signal
sequences. Nucleic acid target motifs include, but are not limited to, promoter
sequences, hairpin structures and inducible expression elements (protein binding
sequences).

A variety of structural formats for the input and output means can be used 
to input and output the information in the computer-based systems of the present 
invention. A preferred format for an output means ranks fragments of the 
Streptococcus pneumoniae genomic sequences possessing varying degrees of 
homology to the target sequence or target motif. Such presentation provides a 
skilled artisan with a ranking of sequences which contain various amounts of the 
target sequence or target motif and identifies the degree of homology contained in 
the identified fragment. 
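
A ranked output of the kind described above might be sketched as follows. This is a minimal illustration and not the search means of the present examples: it substitutes ungapped percent identity over a sliding window for a true homology score, and the function name, the `top` parameter and the short test sequences are hypothetical.

```python
def rank_fragments(genome: str, target: str, top: int = 3):
    """Rank genome windows by ungapped percent identity to the target.

    Returns up to `top` tuples of (percent_identity, start_index, fragment),
    where start_index is 0-based. Ties are broken by position.
    """
    genome, target = genome.upper(), target.upper()
    width = len(target)
    hits = []
    for start in range(len(genome) - width + 1):
        window = genome[start:start + width]
        matches = sum(a == b for a, b in zip(window, target))
        hits.append((100.0 * matches / width, start, window))
    # Highest identity first, then leftmost position.
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return hits[:top]


if __name__ == "__main__":
    for identity, start, fragment in rank_fragments("ACGTACGTTTGACA", "TTGACA"):
        print(f"{identity:5.1f}%  position {start}  {fragment}")
```

Each reported fragment carries its own identity figure, so the listing both orders the fragments and states the degree of match for each one.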

A variety of comparing means can be used to compare a target sequence or 
target motif with the data storage means to identify sequence fragments of the 
Streptococcus pneumoniae genome. In the present examples, implementing 
software which implements the BLAST and BLAZE algorithms, described in 
Altschul et al., J. Mol. Biol. 215: 403-410 (1990), is used to identify open reading 
frames within the Streptococcus pneumoniae genome. A skilled artisan can readily 
recognize that any one of the publicly available homology search programs can be 
used as the search means for the computer-based systems of the present invention. 
Of course, suitable proprietary systems that may be known to those of skill also 
may be employed in this regard. 

Figure 1 provides a block diagram of a computer system illustrative of 
embodiments of this aspect of the present invention. The computer system 102 
includes a processor 106 connected to a bus 104. Also connected to the bus 104 
are a main memory 108 (preferably implemented as random access memory, RAM) 
and a variety of secondary storage devices 110, such as a hard drive 112 and a 
removable medium storage device 114. The removable medium storage device 114 
may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape 
drive, etc. A removable storage medium 116 (such as a floppy disk, a compact 
disk, a magnetic tape, etc.) containing control logic and/or data recorded therein 
may be inserted into the removable medium storage device 114. The computer 
system 102 includes appropriate software for reading the control logic and/or the 
data from the removable storage medium 116, once it is inserted into the 
removable medium storage device 114. 

A nucleotide sequence of the present invention may be stored in a well 
known manner in the main memory 108, any of the secondary storage devices 110, 
and/or a removable storage medium 116. During execution, software for accessing 
and processing the genomic sequence (such as search tools, comparing tools, etc.) 
resides in main memory 108, in accordance with the requirements and operating 
parameters of the operating system, the hardware system and the software program 
or programs. 

BIOCHEMICAL EMBODIMENTS 

Other embodiments of the present invention are directed to isolated 
fragments of the Streptococcus pneumoniae genome. The fragments of the 
Streptococcus pneumoniae genome of the present invention include, but are not 
limited to, fragments which encode peptides and polypeptides, hereinafter open 
reading frames (ORFs), fragments which modulate the expression of an operably 
linked ORF, hereinafter expression modulating fragments (EMFs) and fragments 
which can be used to diagnose the presence of Streptococcus pneumoniae in a 
sample, hereinafter diagnostic fragments (DFs). 

As used herein, an "isolated nucleic acid molecule" or an "isolated fragment 
of the Streptococcus pneumoniae genome" refers to a nucleic acid molecule 
possessing a specific nucleotide sequence which has been subjected to purification 
means to reduce, from the composition, the number of compounds which are 
normally associated with the composition. Particularly, the term refers to the 
nucleic acid molecules having the sequences set out in SEQ ID NOS: 1-391, to 
representative fragments thereof as described above, and to polynucleotides at least 
95%, preferably at least 99% and especially preferably at least 99.9% identical in 
sequence thereto, also as set out above. 

A variety of purification means can be used to generate the isolated 
fragments of the present invention. These include, but are not limited to, methods 
which separate constituents of a solution based on charge, solubility, or size. 

In one embodiment, Streptococcus pneumoniae DNA can be enzymatically 
sheared to produce fragments of 15-20 kb in length. These fragments can then be 
used to generate a Streptococcus pneumoniae library by inserting them into lambda 
clones as described in the Examples below. Primers flanking, for example, an 
ORF, such as those enumerated in Tables 1-3, can then be generated using 
nucleotide sequence information provided in SEQ ID NOS: 1-391. Well known 
and routine techniques of PCR cloning then can be used to isolate the ORF from 
the lambda DNA library or Streptococcus pneumoniae genomic DNA. Thus, given 
the availability of SEQ ID NOS: 1-391, the information in Tables 1, 2 and 3, and 
the information that may be obtained readily by analysis of the sequences of SEQ 
ID NOS: 1-391 using methods set out above, those of skill will be enabled by the 
present disclosure to isolate any ORF-containing or other nucleic acid fragment of 
the present invention. 

The isolated nucleic acid molecules of the present invention include, but are 
not limited to, single stranded and double stranded DNA, and single stranded RNA. 

As used herein, an "open reading frame," ORF, means a series of triplets 
coding for amino acids without any termination codons and is a sequence 
translatable into protein. 
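
For illustration, a run of triplets free of termination codons can be located by scanning each reading frame for consecutive in-frame stop codons. The sketch below is a naive stand-in and is not the GeneMark analysis used to prepare the tables herein. It assumes the forward strand only, the standard stop codons TAA, TAG and TGA, and the stop-to-stop position numbering used in the tables; the function name and `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard termination codons (assumed)


def stop_to_stop_orfs(contig: str, min_codons: int = 1):
    """Return (start_nt, stop_nt) pairs, 1-based, for forward-strand ORFs.

    start_nt is the first nucleotide of the upstream stop codon and stop_nt
    is the last nucleotide of the next in-frame stop codon.
    """
    contig = contig.upper()
    found = []
    for frame in range(3):
        previous_stop = None  # 0-based index of the last stop codon seen
        for i in range(frame, len(contig) - 2, 3):
            if contig[i:i + 3] in STOP_CODONS:
                if previous_stop is not None:
                    # Codons lying strictly between the two stop codons.
                    sense_codons = (i - previous_stop) // 3 - 1
                    if sense_codons >= min_codons:
                        found.append((previous_stop + 1, i + 3))
                previous_stop = i
    return sorted(found)


if __name__ == "__main__":
    print(stop_to_stop_orfs("TAAATGAAATGA"))
```

A complete analysis would also scan the reverse complement of the contig so that ORFs on the opposite strand are reported.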

Tables 1, 2 and 3 list ORFs in the Streptococcus pneumoniae genomic 
contigs of the present invention that were identified as putative coding regions by 
the GeneMark software using organism-specific second-order Markov probability 
transition matrices. It will be appreciated that other criteria can be used, in 
accordance with well known analytical methods, such as those discussed herein, to 
generate more inclusive, more restrictive, or more selective lists. 

Table 1 sets out ORFs in the Streptococcus pneumoniae contigs of the 
present invention that over a continuous region of at least 50 bases are 95% or 
more identical (by BLAST analysis) to a nucleotide sequence available through 
GenBank in October, 1997. 

Table 2 sets out ORFs in the Streptococcus pneumoniae contigs of the 
present invention that are not in Table 1 and match, with a BLASTP probability 
score of 0.01 or less, a polypeptide sequence available through GenBank in 
October, 1997. 

Table 3 sets out ORFs in the Streptococcus pneumoniae contigs of the 
present invention that do not match significantly, by BLASTP analysis, a 
polypeptide sequence available through GenBank in October, 1997. 

In each table, the first and second columns identify the ORF by, 
respectively, contig number and ORF number within the contig; the third column 
indicates the first nucleotide of the ORF (actually the first nucleotide of the stop 
codon immediately preceding the ORF), counting from the 5' end of the contig 
strand; and the fourth column, "stop (nt)," indicates the last nucleotide of the stop 
codon defining the 3' end of the ORF. 

In Tables 1 and 2, column five lists the Reference for the closest 
matching sequence available through GenBank. These reference numbers are the 
database entry numbers commonly used by those of skill in the art, who will be 
familiar with their denominators. Descriptions of the nomenclature are available 
from the National Center for Biotechnology Information. Column six in Tables 1 
and 2 provides the gene name of the matching sequence; column seven provides 
the BLAST identity score and column eight the BLAST similarity score from the 
comparison of the ORF and the homologous gene; and column nine indicates the 
length in nucleotides of the highest scoring segment pair identified by the BLAST 
identity analysis. 

Each ORF described in the tables is defined by "start (nt)" (5') and "stop 
(nt)" (3') nucleotide position numbers. These position numbers refer to the 
boundaries of each ORF and provide orientation with respect to whether the 
forward or reverse strand is the coding strand and in which reading frame the coding 
sequence is contained. The "start" position is the first nucleotide of the triplet 
encoding a stop codon just 5' to the ORF and the "stop" position is the last 
nucleotide of the triplet encoding the next in-frame stop codon (i.e., the stop codon 
at the 3' end of the ORF). Those of ordinary skill in the art appreciate that 
preferred fragments within each ORF described in the table include fragments of 
each ORF which include the entire sequence from the delineated "start" and "stop" 
positions excepting the first and last three nucleotides since these encode stop 
codons. Thus, polynucleotides set out as ORFs in the tables but lacking the three 
(3) 5' nucleotides and the three (3) 3' nucleotides are encompassed by the present 
invention. Those of skill also appreciate that particularly preferred are fragments 
within each ORF that are polynucleotide fragments comprising polypeptide coding 
sequence. As defined herein, "coding sequence" includes the fragment within an 
ORF beginning at the first in-frame ATG (triplet encoding methionine) and ending 
with the last nucleotide prior to the triplet encoding the 3' stop codon. Preferred 
are fragments comprising the entire coding sequence and fragments comprising the 
entire coding sequence, excepting the coding sequence for the N-terminal 
methionine. Those of skill appreciate that the N-terminal methionine is often 
removed during post-translational processing and that polynucleotides lacking the 
ATG can be used to facilitate production of N-terminal fusion proteins which may 
be beneficial in the production or use of genetically engineered proteins. Of course, 
due to the degeneracy of the genetic code many polynucleotides can encode a given 
polypeptide. Thus, the invention further includes polynucleotides comprising a 
nucleotide sequence encoding a polypeptide sequence itself encoded by the coding 
sequence within an ORF described in Tables 1-3 herein. Further, polynucleotides 
at least 95%, preferably at least 99% and especially preferably at least 99.9% 
identical in sequence to the foregoing polynucleotides, are contemplated by the 
present invention. 

Polypeptides encoded by polynucleotides described above and elsewhere 
herein are also provided by the present invention, as are polypeptides comprising 
an amino acid sequence at least about 95%, preferably at least 97% and even more 
preferably 99% identical to the amino acid sequence of a polypeptide encoded by an 
ORF shown in Tables 1-3. These polypeptides may or may not comprise an N- 
terminal methionine. 
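
The "start (nt)" and "stop (nt)" conventions and the definition of "coding sequence" given above can be illustrated with a short sketch. It is offered as an example only and assumes a forward-strand ORF (start less than stop), 1-based positions, and the standard stop codons; the function name and the example contig are hypothetical and are not taken from the tables.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard termination codons (assumed)


def coding_sequence(contig: str, start_nt: int, stop_nt: int):
    """Extract the coding sequence from a stop-to-stop ORF.

    start_nt: first nucleotide of the stop codon 5' to the ORF (1-based).
    stop_nt:  last nucleotide of the stop codon at the 3' end of the ORF.
    Returns the sequence from the first in-frame ATG up to, but excluding,
    the 3' stop codon, or None when the ORF contains no in-frame ATG.
    """
    if start_nt >= stop_nt:
        raise ValueError("this sketch handles forward-strand ORFs only")
    orf = contig[start_nt - 1:stop_nt].upper()
    if len(orf) % 3:
        raise ValueError("start and stop positions are not in the same frame")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    if codons[0] not in STOP_CODONS or codons[-1] not in STOP_CODONS:
        raise ValueError("ORF is not bounded by stop codons")
    interior = codons[1:-1]  # drop the first and last three nucleotides
    if "ATG" not in interior:
        return None
    return "".join(interior[interior.index("ATG"):])


if __name__ == "__main__":
    contig = "CCTAAGCTATGAAAGGGTGATT"  # hypothetical contig
    print(coding_sequence(contig, 3, 20))
```

Dropping the leading ATG from the returned sequence gives the fragment lacking the codon for the N-terminal methionine that is also described above.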

The concepts of percent identity and percent similarity of two polypeptide 
sequences are well understood in the art. For example, two polypeptides 10 amino 
acids in length which differ at three amino acid positions (e.g., at positions 1, 3 
and 5) are said to have a percent identity of 70%. However, the same two 
polypeptides would be deemed to have a percent similarity of 80% if, for example 
at position 5, the amino acid moieties, although not identical, were "similar" (i.e., 
possessed similar biochemical characteristics). Many programs for analysis of 
nucleotide or amino acid sequence similarity, such as FASTA and BLAST, specifically 
list percent identity of a matching region as an output parameter. Thus, for 
instance, Tables 1 and 2 herein enumerate the percent identity of the highest 
scoring segment pair in each ORF and its listed relative. Further details 
concerning the algorithms and criteria used for homology searches are provided 
below and are described in the pertinent literature highlighted by the citations 
provided below. 
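
The arithmetic of the foregoing example can be reproduced with the following sketch. The similarity groupings are a hypothetical simplification chosen for illustration; programs such as FASTA and BLAST instead score similarity with substitution matrices. The two ten-residue sequences are likewise invented so that they differ at positions 1, 3 and 5, with a "similar" pair (E and D) at position 5.

```python
# Hypothetical groups of biochemically similar residues (illustration only).
SIMILAR_GROUPS = ("ILVM", "FYW", "KRH", "DE", "ST", "NQ")


def are_similar(a: str, b: str) -> bool:
    """True when two residues are identical or share a similarity group."""
    return a == b or any(a in group and b in group for group in SIMILAR_GROUPS)


def percent_identity_and_similarity(seq_a: str, seq_b: str):
    """Return (percent identity, percent similarity) for two aligned,
    equal-length polypeptide sequences without gaps."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(zip(seq_a.upper(), seq_b.upper()))
    identical = sum(a == b for a, b in pairs)
    similar = sum(are_similar(a, b) for a, b in pairs)
    return 100.0 * identical / len(pairs), 100.0 * similar / len(pairs)


if __name__ == "__main__":
    # Differences at positions 1 (A/G), 3 (K/D) and 5 (E/D, a similar pair).
    print(percent_identity_and_similarity("AIKDEGSTWF", "GIDDDGSTWF"))
```

Identical residues are counted as similar as well, which is why percent similarity is never lower than percent identity.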

It will be appreciated that other criteria can be used to generate more 
inclusive and more exclusive listings of the types set out in the tables. As those of 
skill will appreciate, narrow and broad searches both are useful. Thus, a skilled 
artisan can readily identify ORFs in contigs of the Streptococcus pneumoniae 
genome other than those listed in Tables 1-3, such as ORFs which are overlapping 
or encoded by the opposite strand of an identified ORF in addition to those 
ascertainable using the computer-based systems of the present invention. 

As used herein, an "expression modulating fragment," EMF, means a 
series of nucleotide molecules which modulates the expression of an operably 
linked ORF or EMF. 

As used herein, a sequence is said to "modulate the expression of an 
operably linked sequence" when the expression of the sequence is altered by the 
presence of the EMF. EMFs include, but are not limited to, promoters, and 
promoter modulating sequences (inducible elements). One class of EMFs is 
fragments which induce the expression of an operably linked ORF in response to a 
specific regulatory factor or physiological event. 

EMF sequences can be identified within the contigs of the Streptococcus 
pneumoniae genome by their proximity to the ORFs provided in Tables 1-3. An 
intergenic segment, or a fragment of the intergenic segment, from about 10 to 200 
nucleotides in length, taken from any one of the ORFs of Tables 1-3 will modulate 
the expression of an operably linked ORF in a fashion similar to that found with the 
naturally linked ORF sequence. As used herein, an "intergenic segment" refers to 
fragments of the Streptococcus pneumoniae genome which are between two 
ORF(s) herein described. EMFs also can be identified using known EMFs as a 
target sequence or target motif in the computer-based systems of the present 
invention. Further, the two methods can be combined and used together. 
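
Identification of candidate EMFs by proximity to ORFs can be sketched as follows. The example is a simplified illustration that assumes ORF boundaries are supplied as 1-based, inclusive position pairs of the kind listed in the tables, given in either orientation. The function name and the example contig are hypothetical, and any segment returned would remain only a candidate until its activity is confirmed experimentally.

```python
def intergenic_segments(contig: str, orfs, min_len: int = 10, max_len: int = 200):
    """Return intergenic segments lying between consecutive ORFs.

    orfs: iterable of (start_nt, stop_nt) pairs, 1-based and inclusive; a
    pair may be given high-to-low for an ORF on the reverse strand.
    Returns (first_nt, last_nt, sequence) for each gap whose length is
    between min_len and max_len nucleotides.
    """
    spans = sorted((min(a, b), max(a, b)) for a, b in orfs)
    if not spans:
        return []
    segments = []
    covered_to = spans[0][1]  # rightmost position covered by any ORF so far
    for left, right in spans[1:]:
        first_nt, last_nt = covered_to + 1, left - 1
        gap_length = last_nt - first_nt + 1
        if min_len <= gap_length <= max_len:
            segments.append((first_nt, last_nt, contig[first_nt - 1:last_nt]))
        covered_to = max(covered_to, right)  # tolerate overlapping ORFs
    return segments


if __name__ == "__main__":
    contig = "A" * 20 + "C" * 15 + "G" * 20 + "T" * 5 + "A" * 10
    # Second ORF is listed high-to-low, as for a reverse-strand ORF.
    print(intergenic_segments(contig, [(1, 20), (55, 36), (61, 70)]))
```

The default bounds of 10 and 200 nucleotides follow the approximate segment lengths stated above; gaps outside that range are simply skipped in this sketch rather than subdivided.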

The presence and activity of an EMF can be confirmed using an EMF trap 
vector. An EMF trap vector contains a cloning site linked to a marker sequence. A 
marker sequence encodes an identifiable phenotype, such as antibiotic resistance or 
a complementing nutrition auxotrophic factor, which can be identified or assayed 
when the EMF trap vector is placed within an appropriate host under appropriate 
conditions. As described above, an EMF will modulate the expression of an 
operably linked marker sequence. A more detailed discussion of various marker 
sequences is provided below. A sequence which is suspected as being an EMF is 
cloned in all three reading frames in one or more restriction sites upstream from the 
marker sequence in the EMF trap vector. The vector is then transformed into an 
appropriate host using known procedures and the phenotype of the transformed 
host is examined under appropriate conditions. As described above, an EMF will 
modulate the expression of an operably linked marker sequence. 

As used herein, a "diagnostic fragment," DF, means a series of nucleotide 
molecules which selectively hybridize to Streptococcus pneumoniae sequences. 
DFs can be readily identified by identifying unique sequences within contigs of the 
Streptococcus pneumoniae genome, such as by using well-known computer 
analysis software, and by generating and testing probes or amplification primers 
consisting of the DF sequence in an appropriate diagnostic format which 
determines amplification or hybridization selectivity. 

The sequences falling within the scope of the present invention are not 
limited to the specific sequences herein described, but also include allelic and 
species variations thereof. Allelic and species variations can be routinely 
determined by comparing the sequences provided in SEQ ID NOS: 1-391, a 
representative fragment thereof, or a nucleotide sequence at least 95%, preferably 
at least 99% and most preferably at least 99.9% identical to SEQ ID NOS: 1-391, 
with a sequence from another isolate of the same species. Furthermore, to 
accommodate codon variability, the invention includes nucleic acid molecules 
coding for the same amino acid sequences as do the specific ORFs disclosed 
herein. In other words, in the coding region of an ORF, substitution of one codon 
for another which encodes the same amino acid is expressly contemplated. Any 
specific sequence disclosed herein can be readily screened for errors by 
resequencing a particular fragment, such as an ORF, in both directions (i.e., 
sequence both strands). Alternatively, error screening can be performed by 
sequencing corresponding polynucleotides of Streptococcus pneumoniae origin 
isolated by using part or all of the fragments in question as a probe or primer. 

Preferred DFs of the present invention comprise at least about 17, 
preferably at least about 20, and more preferably at least about 50 contiguous 
nucleotides within an ORF set out in Tables 1-3. Most highly preferred DFs 
specifically hybridize to a polynucleotide containing the sequence of the ORF from 
which they are derived. Specific hybridization occurs even under stringent 
conditions defined elsewhere herein. 
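
A first computational screen for candidate DFs might be sketched as follows. The sketch is illustrative only: it treats absence of an exact match on either strand of a set of background sequences as a crude proxy for selectivity, which does not replace testing under hybridization or amplification conditions. The function names, the `background` argument and the short example sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def candidate_dfs(orf: str, background, length: int = 17):
    """List windows of an ORF that are absent from every background sequence.

    Returns (start_nt, probe) pairs, where start_nt is 1-based within the
    ORF. A window is rejected when it, or its reverse complement, occurs
    exactly in any background sequence.
    """
    orf = orf.upper()
    background = [sequence.upper() for sequence in background]
    candidates = []
    for i in range(len(orf) - length + 1):
        probe = orf[i:i + length]
        antiprobe = reverse_complement(probe)
        if not any(probe in seq or antiprobe in seq for seq in background):
            candidates.append((i + 1, probe))
    return candidates


if __name__ == "__main__":
    # Short length used only so that the example is easy to follow.
    print(candidate_dfs("AAAACCCCGG", ["TTTTGGGG"], length=4))
```

The default window of 17 nucleotides corresponds to the minimum preferred DF length stated above; longer windows can be requested through the `length` parameter.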

Each of the ORFs of the Streptococcus pneumoniae genome disclosed in 
Tables 1, 2 and 3, and the EMFs found 5' to the ORFs, can be used as 
polynucleotide reagents in numerous ways. For example, the sequences can be 
used as diagnostic probes or diagnostic amplification primers to detect the presence 
of a specific microbe in a sample, particularly Streptococcus pneumoniae. 
Especially preferred in this regard are ORFs such as those of Table 3, which do not 
match previously characterized sequences from other organisms and thus are most 
likely to be highly selective for Streptococcus pneumoniae. Also particularly 
preferred are ORFs that can be used to distinguish between strains of Streptococcus 
pneumoniae, particularly those that distinguish medically important strains, such as 
drug-resistant strains. 

In addition, the fragments of the present invention, as broadly described, 
can be used to control gene expression through triple helix formation or antisense 
DNA or RNA, both of which methods are based on the binding of a polynucleotide 
sequence to DNA or RNA. Triple helix formation optimally results in a shut-off of 
RNA transcription from DNA, while antisense RNA hybridization blocks 
translation of an mRNA molecule into polypeptide. Information from the 
sequences of the present invention can be used to design antisense and triple helix- 
forming oligonucleotides. Polynucleotides suitable for use in these methods are 
usually 20 to 40 bases in length and are designed to be complementary to a region 
of the gene involved in transcription, for triple-helix formation, or to the mRNA 
itself, for antisense inhibition. Both techniques have been demonstrated to be 
effective in model systems, and the requisite techniques are well known and 
involve routine procedures. Triple helix techniques are discussed in, for example, 
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 
(1988); and Dervan et al., Science 251:1360 (1991). Antisense techniques in 
general are discussed in, for instance, Okano, J. Neurochem. 56:560 (1991) and 
Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, 
Boca Raton, FL (1988). 
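
The design of an antisense oligonucleotide complementary to a chosen region of an mRNA can be sketched as follows. The example is a minimal illustration that assumes the coding (sense) strand is supplied as DNA and returns a DNA oligonucleotide; considerations such as secondary structure and melting temperature are ignored. The function names and the example sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def antisense_oligo(sense_strand: str, start_nt: int, length: int = 20) -> str:
    """Return an antisense DNA oligonucleotide, written 5' to 3'.

    sense_strand: coding-strand sequence corresponding to the mRNA.
    start_nt:     1-based position of the first targeted nucleotide.
    length:       oligonucleotide length, limited here to 20 to 40 bases.
    """
    if not 20 <= length <= 40:
        raise ValueError("length must be from 20 to 40 bases")
    target = sense_strand[start_nt - 1:start_nt - 1 + length]
    if start_nt < 1 or len(target) < length:
        raise ValueError("target region extends beyond the sequence")
    return reverse_complement(target)


if __name__ == "__main__":
    sense = "ATGGCTAAGCTTGGATCCAAGG"  # hypothetical coding-strand fragment
    print(antisense_oligo(sense, start_nt=1, length=20))
```

For an antisense RNA rather than DNA, each T in the returned sequence would be read as U.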

The present invention further provides recombinant constructs comprising 
one or more fragments of the Streptococcus pneumoniae genomic fragments and 
contigs of the present invention. Certain preferred recombinant constructs of the 
present invention comprise a vector, such as a plasmid or viral vector, into which a 
fragment of the Streptococcus pneumoniae genome has been inserted, in a forward 
or reverse orientation. In the case of a vector comprising one of the ORFs of the 
present invention, the vector may further comprise regulatory sequences, including, 
for example, a promoter, operably linked to the ORF. For vectors comprising the 
EMFs of the present invention, the vector may further comprise a marker sequence 
or heterologous ORF operably linked to the EMF. 

Large numbers of suitable vectors and promoters are known to those of 
skill in the art and are commercially available for generating the recombinant 
constructs of the present invention. The following vectors are provided by way of 
example. Useful bacterial vectors include phagescript, PsiX174, pBluescript SK, 
pBS KS, pNH8a, pNH16a, pNH18a, pNH46a (available from Stratagene); 
pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (available from Pharmacia). 
Useful eukaryotic vectors include pWLneo, pSV2cat, pOG44, pXT1, pSG 
(available from Stratagene); pSVK3, pBPV, pMSG, pSVL (available from 
Pharmacia). 

Promoter regions can be selected from any desired gene using CAT 
(chloramphenicol transferase) vectors or other vectors with selectable markers. 
Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial 
promoters include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic 
promoters include CMV immediate early, HSV thymidine kinase, early and late 
SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the 
appropriate vector and promoter is well within the level of ordinary skill in the art. 

The present invention further provides host cells containing any one of the 
isolated fragments of the Streptococcus pneumoniae genomic fragments and 
contigs of the present invention, wherein the fragment has been introduced into the 
host cell using known methods. The host cell can be a higher eukaryotic host 
cell, such as a mammalian cell, a lower eukaryotic host cell, such as a yeast cell, or 
a procaryotic cell, such as a bacterial cell. 

A polynucleotide of the present invention, such as a recombinant construct 
comprising an ORF of the present invention, may be introduced into the host by a 
variety of well established techniques that are standard in the art, such as calcium 
phosphate transfection, DEAE-dextran mediated transfection and electroporation, 
which are described in, for instance, Davis, L. et al., BASIC METHODS IN 
MOLECULAR BIOLOGY (1986). 

A host cell containing one of the fragments of the Streptococcus 
pneumoniae genomic fragments and contigs of the present invention can be used 
in conventional manners to produce the gene product encoded by the isolated 
fragment (in the case of an ORF) or can be used to produce a heterologous protein 
under the control of the EMF. 

The present invention further provides 
isolated polypeptides encoded by the nucleic acid fragments of the present 
invention or by degenerate variants of the nucleic acid fragments of the present 
invention. By "degenerate variant" is intended nucleotide fragments which differ 
from a nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide 
sequence but, due to the degeneracy of the Genetic Code, encode an identical 
polypeptide sequence. 
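
Whether two nucleotide sequences are degenerate variants of one another can be checked by translating both and comparing the encoded polypeptides, as in the sketch below. The sketch assumes the standard genetic code and complete, in-frame sequences; the function names and example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
_BASES = "TCAG"
_AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
                "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): amino_acid
               for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(seq_a: str, seq_b: str) -> bool:
    """True when the sequences differ but encode the same polypeptide."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)


if __name__ == "__main__":
    original = "ATGAAAGGGCTG"  # encodes M-K-G-L
    variant = "ATGAAGGGACTC"   # synonymous codons at positions 2 to 4
    print(translate(original), translate(variant),
          is_degenerate_variant(original, variant))
```

Sequences containing characters other than A, C, G and T raise a `KeyError` in this sketch, which leaves the handling of ambiguous bases to the caller.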

Preferred nucleic acid fragments of the present invention are the ORFs and 
subfragments thereof depicted in Tables 2 and 3 which encode proteins. 

A variety of methodologies known in the art can be utilized to obtain any 
one of the isolated polypeptides or proteins of the present invention. At the 
simplest level, the amino acid sequence can be synthesized using commercially 
available peptide synthesizers. This is particularly useful in producing small 
peptides and fragments of larger polypeptides. Such short fragments as may be 
obtained most readily by synthesis are useful, for example, in generating antibodies 
against the native polypeptide, as discussed further below. 

In an alternative method, the polypeptide or protein is purified from 
bacterial cells which naturally produce the polypeptide or protein. One skilled in 
the art can readily employ well-known methods for isolating polypeptides and 
proteins to isolate and purify polypeptides or proteins of the present invention 
produced naturally by a bacterial strain, or by other methods. Methods for 
isolation and purification that can be employed in this regard include, but are not 
limited to, immunochromatography, HPLC, size-exclusion chromatography, 
ion-exchange chromatography, and immuno-affinity chromatography. 

The polypeptides and proteins of the present invention also can be purified 
from cells which have been altered to express the desired polypeptide or protein. 
As used herein, a cell is said to be altered to express a desired polypeptide or 
protein when the cell, through genetic manipulation, is made to produce a 
polypeptide or protein which it normally does not produce or which the cell 
normally produces at a lower level. Those skilled in the art can readily adapt 
procedures for introducing and expressing either recombinant or synthetic 
sequences into eukaryotic or prokaryotic cells in order to generate a cell which 
produces one of the polypeptides or proteins of the present invention. 

Any host/vector system can be used to express one or more of the ORFs of 
the present invention. These include, but are not limited to, eukaryotic hosts such 
as HeLa cells, CV-1 cells, COS cells, and Sf9 cells, as well as prokaryotic hosts 
such as E. coli and B. subtilis. The most preferred cells are those which do not 
normally express the particular polypeptide or protein or which express the 
polypeptide or protein at a low natural level. 

"Recombinant," as used herein, means that a polypeptide or protein is 
derived from recombinant (e.g., microbial or mammalian) expression systems. 
"Microbial" refers to recombinant polypeptides or proteins made in bacterial or 
fungal (e.g., yeast) expression systems. As a product, "recombinant 
microbial" defines a polypeptide or protein essentially free of native endogenous 
substances and unaccompanied by associated native glycosylation. Polypeptides or 
proteins expressed in most bacterial cultures, e.g., E. coli, will be free of 
glycosylation modifications; polypeptides or proteins expressed in yeast will have a 
glycosylation pattern different from that expressed in mammalian cells. 

"Nucleotide sequence" refers to a heteropolymer of deoxyribonucleotides. 
Generally, DNA segments encoding the polypeptides and proteins provided by this 
invention are assembled from fragments of the Streptococcus pneumoniae genome 
and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a 
synthetic gene which is capable of being expressed in a recombinant transcriptional 
unit comprising regulatory elements derived from a microbial or viral operon. 

"Recombinant expression vehicle or vector" refers to a plasmid or phage or 
virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. The 
expression vehicle can comprise a transcriptional unit comprising an assembly of 
(1) genetic regulatory elements necessary for gene expression in the host, 
including elements required to initiate and maintain transcription at a level sufficient 
for suitable expression of the desired polypeptide, including, for example, 
promoters and, where necessary, an enhancer and a polyadenylation signal; (2) a 
structural or coding sequence which is transcribed into mRNA and translated into 
protein; and (3) appropriate signals to initiate translation at the beginning of the 
desired coding region and terminate translation at its end. Structural units intended 
for use in yeast or eukaryotic expression systems preferably include a leader 
sequence enabling extracellular secretion of translated protein by a host cell. 
Alternatively, where recombinant protein is expressed without a leader or transport 
sequence, it may include an N-terminal methionine residue. This residue may or 
may not be subsequently cleaved from the expressed recombinant protein to 
provide a final product. 

"Recombinant expression system" means host cells which have stably 
integrated a recombinant transcriptional unit into chromosomal DNA or carry the 
recombinant transcriptional unit extrachromosomally. The cells can be prokaryotic 
or eukaryotic. Recombinant expression systems as defined herein will express 
heterologous polypeptides or proteins upon induction of the regulatory elements 
linked to the DNA segment or synthetic gene to be expressed. 

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or 
other cells under the control of appropriate promoters. Cell-free translation 
systems can also be employed to produce such proteins using RNAs derived from 
the DNA constructs of the present invention. Appropriate cloning and expression 
vectors for use with prokaryotic and eukaryotic hosts are described in Sambrook et 
al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor 
Laboratory Press, Cold Spring Harbor, New York (1989), the disclosure of which 
is hereby incorporated by reference in its entirety. 

Generally, recombinant expression vectors will include origins of 
replication and selectable markers permitting transformation of the host cell, e.g., 
the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a 
promoter derived from a highly expressed gene to direct transcription of a 
downstream structural sequence. Such promoters can be derived from operons 
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), 
alpha-factor, acid phosphatase, or heat shock proteins, among others. The heterologous 
structural sequence is assembled in appropriate phase with translation initiation and 
termination sequences, and preferably, a leader sequence capable of directing 
secretion of translated protein into the periplasmic space or extracellular medium. 
Optionally, the heterologous sequence can encode a fusion protein including an N- 
terminal identification peptide imparting desired characteristics, e.g., stabilization 
or simplified purification of expressed recombinant product. 

Useful expression vectors for bacterial use are constructed by inserting a 
structural DNA sequence encoding a desired protein together with suitable 
translation initiation and termination signals in operable reading phase with a 
functional promoter. The vector will comprise one or more phenotypic selectable 
markers and an origin of replication to ensure maintenance of the vector and, when 
desirable, provide amplification within the host. 

Suitable prokaryotic hosts for transformation include strains of E. coli, B. 
subtilis, Salmonella typhimurium and various species within the genera 
Pseudomonas and Streptomyces. Others may also be employed as a matter of 
choice. 

As a representative but non-limiting example, useful expression vectors for 
bacterial use can comprise a selectable marker and bacterial origin of replication 
derived from commercially available plasmids comprising genetic elements of the 
well known cloning vector pBR322 (ATCC 37017). Such commercial vectors 
include, for example, pKK223-3 (available from Pharmacia Fine Chemicals, 
Uppsala, Sweden) and GEM1 (available from Promega Biotec, Madison, WI, 
USA). These pBR322 "backbone" sections are combined with an appropriate 
promoter and the structural sequence to be expressed. 

Following transformation of a suitable host strain and growth of the host
strain to an appropriate cell density, the selected promoter, where it is inducible, is
derepressed or induced by appropriate means (e.g., temperature shift or chemical
induction) and cells are cultured for an additional period to provide for expression
of the induced gene product. Thereafter cells are typically harvested, generally by
centrifugation, disrupted to release expressed protein, generally by physical or
chemical means, and the resulting crude extract is retained for further purification.

Various mammalian cell culture systems can also be employed to express
recombinant protein. Examples of mammalian expression systems include the
COS-7 lines of monkey kidney fibroblasts, described in Gluzman, Cell 23:175
(1981), and other cell lines capable of expressing a compatible vector, for example,
the C127, 3T3, CHO, HeLa and BHK cell lines.

Mammalian expression vectors will comprise an origin of replication, a
suitable promoter and enhancer, and also any necessary ribosome binding sites,
polyadenylation site, splice donor and acceptor sites, transcriptional termination
sequences, and 5' flanking nontranscribed sequences. DNA sequences derived
from the SV40 viral genome, for example, SV40 origin, early promoter, enhancer,
splice, and polyadenylation sites may be used to provide the required
nontranscribed genetic elements.

Recombinant polypeptides and proteins produced in bacterial culture are
usually isolated by initial extraction from cell pellets, followed by one or more
salting-out, aqueous ion exchange or size exclusion chromatography steps.
Microbial cells employed in expression of proteins can be disrupted by any
convenient method, including freeze-thaw cycling, sonication, mechanical
disruption, or use of cell lysing agents. Protein refolding steps can be used, as
necessary, in completing configuration of the mature protein. Finally, high
performance liquid chromatography (HPLC) can be employed for final purification
steps.

The present invention further includes isolated polypeptides, proteins and
nucleic acid molecules which are substantially equivalent to those herein described.
As used herein, substantially equivalent can refer both to nucleic acid and amino
acid sequences, for example a mutant sequence, that varies from a reference
sequence by one or more substitutions, deletions, or additions, the net effect of
which does not result in an adverse functional dissimilarity between reference and
subject sequences. For purposes of the present invention, sequences having
equivalent biological activity and equivalent expression characteristics are
considered substantially equivalent. For purposes of determining equivalence,
truncation of the mature sequence should be disregarded.

The invention further provides methods of obtaining, from other strains of
Streptococcus pneumoniae, homologs of the fragments of the Streptococcus
pneumoniae genome of the present invention and homologs of the proteins encoded
by the ORFs of the present invention. As used herein, a sequence or protein of
Streptococcus pneumoniae is defined as a homolog of one of the
Streptococcus pneumoniae fragments or contigs, or of a protein encoded by one of
the ORFs of the present invention, if it shares significant homology with one of the
fragments of the Streptococcus pneumoniae genome of the present invention or a
protein encoded by one of the ORFs of the present invention. Specifically, by
using the sequence disclosed herein as a probe or as primers, and techniques such
as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain
homologs.

As used herein, two nucleic acid molecules or proteins are said to "share
significant homology" if the two contain regions which possess greater than 85%
sequence (amino acid or nucleic acid) homology. Preferred homologs in this
regard are those with more than 90% homology. Especially preferred are those
with 93% or more homology. Among especially preferred homologs those with
95% or more homology are particularly preferred. Very particularly preferred
among these are those with 97% and even more particularly preferred among those
are homologs with 99% or more homology. The most preferred homologs among
these are those with 99.9% homology or more. It will be understood that, among
measures of homology, identity is particularly preferred in this regard.
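
The homology tiers above can be sketched in code, taking identity as the measure of homology. This is an illustrative sketch only: it assumes the two regions have already been aligned to equal length (with "-" marking gaps), and the names `percent_identity`, `highest_tier` and `HOMOLOGY_TIERS` are hypothetical. The 85% and 90% thresholds are treated as strict ("greater than"), and the remaining thresholds as inclusive ("or more"), following the wording of the paragraph; treating the 97% tier as inclusive is an assumption.

```python
from typing import Optional

# (threshold in percent, strict?) ordered from most to least preferred.
# strict=True means the text says "greater than" / "more than".
HOMOLOGY_TIERS = [
    (99.9, False),
    (99.0, False),
    (97.0, False),  # assumption: the text gives "97%" without qualifier
    (95.0, False),
    (93.0, False),
    (90.0, True),
    (85.0, True),
]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to the same non-zero length")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"  # a gap never counts as a match
    )
    return 100.0 * matches / len(aligned_a)


def highest_tier(pct: float) -> Optional[float]:
    """Return the highest threshold met, or None if below significant homology."""
    for threshold, strict in HOMOLOGY_TIERS:
        if pct > threshold or (not strict and pct == threshold):
            return threshold
    return None
```

For example, a region scoring exactly 90% identity meets only the "greater than 85%" tier under this reading, because the 90% tier is worded as "more than 90%".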

Region specific primers or probes derived from the nucleotide sequence
provided in SEQ ID NOS:1-391 or from a nucleotide sequence at least 95%,
particularly at least 99%, especially at least 99.5% identical to a sequence of SEQ
ID NOS:1-391 can be used to prime DNA synthesis and PCR amplification, as
well as to identify colonies containing cloned DNA encoding a homolog. Methods
suitable to this aspect of the present invention are well known and have been
described in great detail in many publications such as, for example, Innis et al.,
PCR Protocols, Academic Press, San Diego, CA (1990).
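
As a rough illustration of deriving region specific primers from a known sequence, the sketch below takes the forward primer from the start of the target region and the reverse primer as the reverse complement of its end. The function names, the zero-based coordinates and the default primer length of 20 bases are assumptions for illustration; practical primer design would also weigh melting temperature, GC content and self-complementarity, which this sketch ignores.

```python
# Translation table for complementing DNA bases (case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_pair(template: str, start: int, end: int, length: int = 20):
    """Return (forward, reverse) primers flanking template[start:end].

    Both primers are written 5' to 3'. The region must be long enough
    that the two primers do not overlap.
    """
    region = template[start:end]
    if len(region) < 2 * length:
        raise ValueError("target region is too short for two primers of this length")
    forward = region[:length]
    reverse = reverse_complement(region[-length:])
    return forward, reverse
```

The forward primer matches the given strand, while the reverse primer anneals to it, so that extension from both primers brackets the chosen region.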

When using primers derived from SEQ ID NOS:1-391 or from a nucleotide
sequence having an aforementioned identity to a sequence of SEQ ID NOS:1-391,
one skilled in the art will recognize that by employing high stringency conditions
(e.g., annealing at 50-60°C in 6X SSPC and 50% formamide, and washing at
50-65°C in 0.5X SSPC) only sequences which are greater than 75% homologous to
the primer will be amplified. By employing lower stringency conditions (e.g.,
hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C
in 0.5X SSPC), sequences which are greater than 40-50% homologous to the
primer will also be amplified.

When using DNA probes derived from SEQ ID NOS:1-391, or from a
nucleotide sequence having an aforementioned identity to a sequence of SEQ ID
NOS:1-391, for colony/plaque hybridization, one skilled in the art will recognize
that by employing high stringency conditions (e.g., hybridizing at 50-65°C in 5X
SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC), sequences
having regions which are greater than 90% homologous to the probe can be
obtained, and that by employing lower stringency conditions (e.g., hybridizing at
35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X
SSPC), sequences having regions which are greater than 35-45% homologous to
the probe will be obtained.

Any organism can be used as the source for homologs of the present 
invention so long as the organism naturally expresses such a protein or contains 
genes encoding the same. The most preferred organisms for isolating homologs are
bacteria which are closely related to Streptococcus pneumoniae.

ILLUSTRATIVE USES OF COMPOSITIONS OF THE 
INVENTION 

Each ORF provided in Tables 1 and 2 is identified with a function by
homology to a known gene or polypeptide. As a result, one skilled in the art can
use the polypeptides of the present invention for commercial, therapeutic and
industrial purposes consistent with the type of putative identification of the
polypeptide. Such identifications permit one skilled in the art to use the
Streptococcus pneumoniae ORFs in a manner similar to the known type of
sequences for which the identification is made; for example, to ferment a particular
sugar source or to produce a particular metabolite. A variety of reviews illustrative
of this aspect of the invention are available, including the following reviews on the
industrial use of enzymes, for example, BIOCHEMICAL ENGINEERING AND
BIOTECHNOLOGY HANDBOOK, 2nd Ed., MacMillan Publications, Ltd. NY
(1991) and BIOCATALYSTS IN ORGANIC SYNTHESES, Tramper et al., Eds.,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). A variety of
exemplary uses that illustrate this and similar aspects of the present invention are
discussed below.

1. Biosynthetic Enzymes 

Open reading frames encoding proteins that mediate the catalytic
reactions of intermediary and macromolecular metabolism, the
biosynthesis of small molecules, cellular processes and other functions can be
used for industrial biosynthesis. These include
enzymes involved in the degradation of the intermediary products of metabolism,
enzymes involved in central intermediary metabolism, enzymes involved in
respiration, both aerobic and anaerobic, enzymes involved in fermentation,
enzymes involved in ATP-proton motive force conversion, enzymes involved in
broad regulatory function, enzymes involved in amino acid synthesis, enzymes
involved in nucleotide synthesis, and enzymes involved in cofactor and vitamin
synthesis.

The various metabolic pathways present in Streptococcus pneumoniae can
be identified based on absolute nutritional requirements as well as by examining the
various enzymes identified in Tables 1-3 and SEQ ID NOS:1-391.

Of particular interest are polypeptides involved in the degradation of 
intermediary metabolites as well as non-macromolecular metabolism. Such 
enzymes include amylases, glucose oxidases, and catalase. 

Proteolytic enzymes are another class of commercially important enzymes.
Proteolytic enzymes find use in a number of industrial processes including the
processing of flax and other vegetable fibers, in the extraction, clarification and
depectinization of fruit juices, in the extraction of vegetable oil and in the
maceration of fruits and vegetables to give unicellular fruits. A detailed review of
the proteolytic enzymes used in the food industry is provided in Rombouts et al.,
Symbiosis 21:19 (1986) and Voragen et al. in Biocatalysts In Agricultural
Biotechnology, Whitaker et al., Eds., American Chemical Society Symposium
Series 389:93 (1989).

The metabolism of sugars is an important aspect of the primary metabolism
of Streptococcus pneumoniae. Enzymes involved in the degradation of sugars,
such as, particularly, glucose, galactose, fructose and xylose, can be used in
industrial fermentation. Some of the important sugar transforming enzymes, from
a commercial viewpoint, include sugar isomerases such as glucose isomerase.
Other metabolic enzymes have found commercial use, such as glucose oxidases
which produce ketogulonic acid (KGA). KGA is an intermediate in the
commercial production of ascorbic acid using the Reichstein's procedure, as
described in Krueger et al., Biotechnology 6(A), Rhine et al., Eds., Verlag Press,
Weinheim, Germany (1984).

Glucose oxidase (GOD) is commercially available and has been used in
purified form as well as in an immobilized form for the deoxygenation of beer.
See, for instance, Hartmeir et al., Biotechnology Letters 1:21 (1979). The most
important application of GOD is the industrial scale fermentation of gluconic acid.
There is a market for gluconic acids, which are used in the detergent, textile, leather,
photographic, pharmaceutical, food, feed and concrete industries, as described, for
example, in Bigelis et al., beginning on page 357 in GENE MANIPULATIONS
AND FUNGI; Benett et al., Eds., Academic Press, New York (1985). In addition
to industrial applications, GOD has found applications in medicine for quantitative
determination of glucose in body fluids and, recently, in biotechnology for analyzing
syrups from starch and cellulose hydrolysates. This application is described in
Owusu et al., Biochem. et Biophysica. Acta. 872:83 (1986), for instance.

The main sweetener used in the world today is sugar which comes from 
sugar beets and sugar cane. In the field of industrial enzymes, the glucose 
isomerase process shows the largest expansion in the market today. Initially, 
soluble enzymes were used and later immobilized enzymes were developed 
(Krueger et al., Biotechnology, The Textbook of Industrial Microbiology, Sinauer
Associated Incorporated, Sunderland, Massachusetts (1990)). Today, the use of
glucose-produced high fructose syrups is by far the largest industrial business
using immobilized enzymes. A review of the industrial use of these enzymes is
provided by Jorgensen, Starch 40:307 (1988).

Proteinases, such as alkaline serine proteinases, are used as detergent
additives and thus represent one of the largest volumes of microbial enzymes used
in the industrial sector. Because of their industrial importance, there is a large body
of published and unpublished information regarding the use of these enzymes in
industrial processes. (See Faultman et al., Acid Proteases Structure Function and
Biology, Tang, J., ed., Plenum Press, New York (1977) and Godfrey et al.,
Industrial Enzymes, MacMillan Publishers, Surrey, UK (1983) and Hepner et al.,
Report Industrial Enzymes by 1990, Hel Hepner & Associates, London (1986)).

Another class of commercially usable proteins of the present invention are
the microbial lipases, described by, for instance, Macrae et al., Philosophical
Transactions of the Chiral Society of London 310:221 (1985) and Poserke, Journal
of the American Oil Chemist Society 61:1758 (1984). A major use of lipases is in
the fat and oil industry for the production of neutral glycerides using lipase
catalyzed inter-esterification of readily available triglycerides. Applications of
lipases include use as a detergent additive to facilitate the removal of fats from
fabrics in the course of the washing procedures.

The use of enzymes, and in particular microbial enzymes, as catalysts for
key steps in the synthesis of complex organic molecules is gaining popularity at a
great rate. One area of great interest is the preparation of chiral intermediates.
Preparation of chiral intermediates is of interest to a wide range of synthetic
chemists, particularly those scientists involved with the preparation of new
pharmaceuticals, agrochemicals, fragrances and flavors. (See Davies et al., Recent
Advances in the Generation of Chiral Intermediates Using Enzymes, CRC Press,
Boca Raton, Florida (1990)). The following reactions catalyzed by enzymes are of
interest to organic chemists: hydrolysis of carboxylic acid esters, phosphate esters,
amides and nitriles, esterification reactions, trans-esterification reactions, synthesis
of amides, reduction of alkanones and oxoalkanates, oxidation of alcohols to
carbonyl compounds, oxidation of sulfides to sulfoxides, and carbon bond forming
reactions such as the aldol reaction.

When considering the use of an enzyme encoded by one of the ORFs of the
present invention for biotransformation and organic synthesis it is sometimes
necessary to consider the respective advantages and disadvantages of using a
microorganism as opposed to an isolated enzyme. Pros and cons of using a whole
cell system on the one hand or an isolated partially purified enzyme on the other
hand have been described in detail by Bud et al., Chemistry in Britain (1987), p.
127.

Amino transferases, enzymes involved in the biosynthesis and metabolism
of amino acids, are useful in the catalytic production of amino acids. The
advantage of using microbial based enzyme systems is that the amino transferase
enzymes catalyze the stereo-selective synthesis of only L-amino acids and
generally possess uniformly high catalytic rates. A description of the use of amino
transferases for amino acid production is provided by Roselle-David, Methods of
Enzymology 136:479 (1987).

Another category of useful proteins encoded by the ORFs of the present
invention includes enzymes involved in nucleic acid synthesis, repair, and
recombination.

2. Generation of Antibodies 

As described here, the proteins of the present invention, as well as
homologs thereof, can be used in a variety of procedures and methods known in
the art which are currently applied to other proteins. The proteins of the present
invention can further be used to generate an antibody which selectively binds the
protein. Such antibodies can be either monoclonal or polyclonal antibodies, as well
as fragments of these antibodies, and humanized forms.

The invention further provides antibodies which selectively bind to one of 
the proteins of the present invention and hybridomas which produce these 
antibodies. A hybridoma is an immortalized cell line which is capable of secreting 
a specific monoclonal antibody. 

In general, techniques for preparing polyclonal and monoclonal antibodies
as well as hybridomas capable of producing the desired antibody are well known in
the art (Campbell, A. M., Monoclonal Antibody Technology: Laboratory
Techniques In Biochemistry And Molecular Biology, Elsevier Science Publishers,
Amsterdam, The Netherlands (1984); St. Groth et al., J. Immunol. Methods
35:1-21 (1980); Kohler and Milstein, Nature 256:495-497 (1975)), and include the
trioma technique and the human B-cell hybridoma technique (Kozbor et al.,
Immunology Today 4:72 (1983); pgs. 77-96 of Cole et al., in Monoclonal
Antibodies And Cancer Therapy, Alan R. Liss, Inc. (1985)). Any animal (mouse,
rabbit, etc.) which is known to produce antibodies can be immunized with the
pseudogene polypeptide. Methods for immunization are well known in the art.
Such methods include subcutaneous or intraperitoneal injection of the
polypeptide. One skilled in
the art will recognize that the amount of the protein encoded by the ORF of the
present invention used for immunization will vary based on the animal which is
immunized, the antigenicity of the peptide and the site of injection.

The protein which is used as an immunogen may be modified or
administered in an adjuvant in order to increase the protein's antigenicity. Methods
of increasing the antigenicity of a protein are well known in the art and include, but
are not limited to, coupling the antigen with a heterologous protein (such as globulin
or galactosidase) or through the inclusion of an adjuvant during immunization.

For monoclonal antibodies, spleen cells from the immunized animals are
removed, fused with myeloma cells, such as SP2/0-Ag14 myeloma cells, and
allowed to become monoclonal antibody producing hybridoma cells.

Any one of a number of methods well known in the art can be used to
identify the hybridoma cell which produces an antibody with the desired
characteristics. These include screening the hybridomas with an ELISA assay,
western blot analysis, or radioimmunoassay (Lutz et al., Exp. Cell Res.
175:109-124 (1988)).

Hybridomas secreting the desired antibodies are cloned and the class and
subclass are determined using procedures known in the art (Campbell, A. M.,
Monoclonal Antibody Technology: Laboratory Techniques in Biochemistry and
Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands
(1984)).

Techniques described for the production of single chain antibodies (U.S.
Patent 4,946,778) can be adapted to produce single chain antibodies to proteins of
the present invention.

For polyclonal antibodies, antibody containing antiserum is isolated from the
immunized animal and is screened for the presence of antibodies with the desired
specificity using one of the above-described procedures.

The present invention further provides the above-described antibodies in
detectably labelled form. Antibodies can be detectably labelled through the use of
radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels (such
as horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as
FITC or rhodamine, etc.), paramagnetic atoms, etc. Procedures for accomplishing
such labeling are well-known in the art, for example see Sternberger et al., J.
Histochem. Cytochem. 18:315 (1970); Bayer, E. A. et al., Meth. Enzym. 62:308
(1979); Engval, E. et al., Immunol. 109:129 (1972); Goding, J. W., J. Immunol.
Meth. 13:215 (1976).

The labeled antibodies of the present invention can be used for in vitro, in
vivo, and in situ assays to identify cells or tissues in which a fragment of the
Streptococcus pneumoniae genome is expressed.

The present invention further provides the above-described antibodies
immobilized on a solid support. Examples of such solid supports include plastics
such as polycarbonate, complex carbohydrates such as agarose and sepharose, and
acrylic resins such as polyacrylamide and latex beads. Techniques for
coupling antibodies to such solid supports are well known in the art (Weir, D. M.
et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell Scientific
Publications, Oxford, England, Chapter 10 (1986); Jacoby, W. D. et al., Meth.
Enzym. 34 Academic Press, N. Y. (1974)). The immobilized antibodies of the
present invention can be used for in vitro, in vivo, and in situ assays as well as for
immunoaffinity purification of the proteins of the present invention.

3. Diagnostic Assays and Kits 

The present invention further provides methods to identify the expression
of one of the ORFs of the present invention, or homolog thereof, in a test sample,
using one of the DFs or antibodies of the present invention.

In detail, such methods comprise incubating a test sample with one or more 
of the antibodies or one or more of the DFs of the present invention and assaying 
for binding of the DFs or antibodies to components within the test sample. 

Conditions for incubating a DF or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection
methods employed, and the type and nature of the DF or antibody used in the
assay. One skilled in the art will recognize that any one of the commonly available
hybridization, amplification or immunological assay formats can readily be adapted
to employ the DFs or antibodies of the present invention. Examples of such assays
can be found in Chard, T., An Introduction to Radioimmunoassay and Related
Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986);
Bullock, G. R. et al., Techniques in Immunocytochemistry, Academic Press,
Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice and
Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and
Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands
(1985).

The test samples of the present invention include cells, protein or membrane
extracts of cells, or biological fluids such as sputum, blood, serum, plasma, or
urine. The test sample used in the above-described method will vary based on the
assay format, nature of the detection method and the tissues, cells or extracts used
as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to
obtain a sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which
contain the necessary reagents to carry out the assays of the present invention.

Specifically, the invention provides a compartmentalized kit to receive, in
close confinement, one or more containers which comprise: (a) a first container
comprising one of the DFs or antibodies of the present invention; and (b) one or
more other containers comprising one or more of the following: wash reagents and
reagents capable of detecting the presence of a bound DF or antibody.

In detail, a compartmentalized kit includes any kit in which reagents are
contained in separate containers. Such containers include small glass containers,
plastic containers or strips of plastic or paper. Such containers allow one to
efficiently transfer reagents from one compartment to another compartment such
that the samples and reagents are not cross-contaminated, and the agents or
solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept
the test sample, a container which contains the antibodies used in the assay,
containers which contain wash reagents (such as phosphate buffered saline,
Tris-buffers, etc.), and containers which contain the reagents used to detect the bound
antibody or DF.

Types of detection reagents include labelled nucleic acid probes, labelled
secondary antibodies, or in the alternative, if the primary antibody is labelled, the
enzymatic, or antibody binding reagents which are capable of reacting with the
labelled antibody. One skilled in the art will readily recognize that the disclosed
DFs and antibodies of the present invention can be readily incorporated into one of
the established kit formats which are well known in the art.

4. Screening Assay for Binding Agents

Using the isolated proteins of the present invention, the present invention
further provides methods of obtaining and identifying agents which bind to a
protein encoded by one of the ORFs of the present invention or to one of the
Streptococcus pneumoniae fragments and contigs herein
described.

In general, such methods comprise steps of: 

(a) contacting an agent with an isolated protein encoded by one of the 
ORFs of the present invention, or an isolated fragment of the Streptococcus 
pneumoniae genome; and 

(b) determining whether the agent binds to said protein or said fragment.

The agents screened in the above assay can be, but are not limited to,
peptides, carbohydrates, vitamin derivatives, or other pharmaceutical agents. The
agents can be selected and screened at random or rationally selected or designed
using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates,
pharmaceutical agents and the like are selected at random and are assayed for their
ability to bind to the protein encoded by the ORF of the present invention.

Alternatively, agents may be rationally selected or designed. As used
herein, an agent is said to be "rationally selected or designed" when the agent is
chosen based on the configuration of the particular protein. For example, one
skilled in the art can readily adapt currently available procedures to generate
peptides, pharmaceutical agents and the like capable of binding to a specific peptide
sequence in order to generate rationally designed antipeptide peptides, for example
see Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," in
Synthetic Peptides, A User's Guide, W. H. Freeman, NY (1992), pp. 289-307,
and Kaspczak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or
the like.

In addition to the foregoing, one class of agents of the present invention, as 
broadly described, can be used to control gene expression through binding to one 
of the ORFs or EMFs of the present invention. As described above, such agents 
can be randomly screened or rationally designed/selected. Targeting the ORF or 
EMF allows a skilled artisan to design sequence specific or element specific agents, 
modulating the expression of either a single ORF or multiple ORFs which rely on 
the same EMF for expression control.

One class of DNA binding agents are agents which contain base residues
which hybridize or form a triple helix by binding to DNA or RNA. Such agents
can be based on the classic phosphodiester, ribonucleic acid backbone, or can be a
variety of sulfhydryl or polymeric derivatives which have base attachment capacity.

Agents suitable for use in these methods usually contain 20 to 40 bases and
are designed to be complementary to a region of the gene involved in transcription
(triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al.,
Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991)) or to the
mRNA itself (antisense - Okano, J. Neurochem. 56:560 (1991);
Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press,
Boca Raton, FL (1988)). Triple helix formation optimally results in a shut-off of
RNA transcription from DNA, while antisense RNA hybridization blocks
translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the
sequences of the present invention can be used to design antisense and triple
helix-forming oligonucleotides, and other DNA binding agents.
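
By way of illustration only, the design of an antisense oligonucleotide of 20 to 40 bases complementary to a chosen region of an mRNA can be sketched as follows. The function name `antisense_oligo`, the zero-based `start` coordinate and the choice to return a DNA oligonucleotide are assumptions made for this sketch; it does not address backbone chemistry, target accessibility or specificity screening.

```python
# Maps each mRNA base to the DNA base that pairs with it.
_RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_oligo(mrna: str, start: int, length: int = 20) -> str:
    """Return a DNA oligo (5' to 3') complementary to mrna[start:start+length]."""
    if not 20 <= length <= 40:
        raise ValueError("length should be between 20 and 40 bases")
    target = mrna.upper()[start:start + length]
    if len(target) < length:
        raise ValueError("target region runs past the end of the mRNA")
    # Pairing is antiparallel, so the target is read in reverse while complementing.
    return "".join(_RNA_TO_DNA_COMPLEMENT[base] for base in reversed(target))
```

Because the two strands pair in antiparallel orientation, the 3' end of the returned oligonucleotide lies opposite the 5' end of the targeted mRNA region.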

5. Pharmaceutical Compositions and Vaccines 

The present invention further provides pharmaceutical agents which can be
used to modulate the growth or pathogenicity of Streptococcus pneumoniae, or
another related organism, in vivo or in vitro. As used herein, a "pharmaceutical
agent" is defined as a composition of matter which can be formulated using known
techniques to provide a pharmaceutical composition. As used herein, the
"pharmaceutical agents of the present invention" refers to the pharmaceutical agents
which are derived from the proteins encoded by the ORFs of the present invention
or are agents which are identified using the herein described assays.

As used herein, a pharmaceutical agent is said to "modulate the growth or
pathogenicity of Streptococcus pneumoniae or a related organism, in vivo or in
vitro" when the agent reduces the rate of growth, rate of division, or viability of
the organism in question. The pharmaceutical agents of the present invention can
modulate the growth or pathogenicity of an organism in many fashions, although
an understanding of the underlying mechanism of action is not needed to practice
the use of the pharmaceutical agents of the present invention. Some agents will
modulate the growth by binding to an important protein thus blocking the biological
activity of the protein, while other agents may bind to a component of the outer
surface of the organism blocking attachment or rendering the organism more prone
to attack by the body's natural immune system. Alternatively, the agent may comprise a
protein encoded by one of the ORFs of the present invention and serve as a
vaccine. The development and use of a vaccine based on outer membrane
components are well known in the art.

As used herein, a "related organism" is a broad term which refers to any 
organism whose growth can be modulated by one of the pharmaceutical agents of 
the present invention. In general, such an organism will contain a homolog of the 
protein which is the target of the pharmaceutical agent or the protein used as a 
vaccine. As such, related organisms do not need to be bacterial but may be fungal 
or viral pathogens. 

The pharmaceutical agents and compositions of the present invention may
be administered in a convenient manner, such as by the oral, topical, intravenous,
intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal routes. The
pharmaceutical compositions are administered in an amount which is effective for
treatment and/or prophylaxis of the specific indication. In general, they are
administered in an amount of at least about 1 mg/kg body weight and in most cases
they will be administered in an amount not in excess of about 1 g/kg body weight
per day. In most cases, the dosage is from about 0.1 mg/kg to about 10 g/kg body
weight daily, taking into account the routes of administration, symptoms, etc.

The agents of the present invention can be used in native form or can be 
modified to form a chemical derivative. As used herein, a molecule is said to be a 
"chemical derivative" of another molecule when it contains additional chemical 
moieties not normally a part of the molecule. Such moieties may improve the 
molecule's solubility, absorption, biological half-life, etc. The moieties may
alternatively decrease the toxicity of the molecule, eliminate or attenuate any 
undesirable side effect of the molecule, etc. Moieties capable of mediating such 
effects are disclosed in, among other sources, REMINGTON'S 
PHARMACEUTICAL SCIENCES (1980) cited elsewhere herein. 
For example, such moieties may change an immunological character of the
functional derivative, such as affinity for a given antibody. Such changes in
immunomodulation activity are measured by the appropriate assay, such as a
competitive type immunoassay. Modifications of such protein properties as redox
or thermal stability, biological half-life, hydrophobicity, susceptibility to proteolytic
degradation or the tendency to aggregate with carriers or into multimers also may
be effected in this way and can be assayed by methods well known to the skilled
artisan.

The therapeutic effects of the agents of the present invention may be
obtained by providing the agent to a patient by any suitable means (e.g., inhalation,
intravenously, intramuscularly, subcutaneously, enterally, or parenterally). It is
preferred to administer the agent of the present invention so as to achieve an
effective concentration within the blood or tissue in which the growth of the
organism is to be controlled. To achieve an effective blood concentration, the
preferred method is to administer the agent by injection. The administration may be
by continuous infusion, or by single or multiple injections.

In providing a patient with one of the agents of the present invention, the 
dosage of the administered agent will vary depending upon such factors as the 
patient's age, weight, height, sex, general medical condition, previous medical 
history, etc. In general, it is desirable to provide the recipient with a dosage of
agent which is in the range of from about 1 pg/kg to 10 mg/kg (body weight of
patient), although a lower or higher dosage may be administered. The
therapeutically effective dose can be lowered by using the agents of the present
invention in combination with one another or with another agent.

As used herein, two or more compounds or agents are said to be
administered "in combination" with each other when either (1) the physiological
effects of each compound, or (2) the serum concentrations of each compound can
be measured at the same time. The composition of the present invention can be
administered concurrently with, prior to, or following the administration of the
other agent.

The agents of the present invention are intended to be provided to recipient
subjects in an amount sufficient to decrease the rate of growth (as defined above) of
the target organism.

The administration of the agent(s) of the invention may be for either a
"prophylactic" or "therapeutic" purpose. When provided prophylactically, the
agent(s) are provided in advance of any symptoms indicative of the organism's
growth. The prophylactic administration of the agent(s) serves to prevent,
attenuate, or decrease the rate of onset of any subsequent infection. When
provided therapeutically, the agent(s) are provided at (or shortly after) the onset of
an indication of infection. The therapeutic administration of the agent(s)
serves to attenuate the pathological symptoms of the infection and to increase the
rate of recovery.

The agents of the present invention are administered to a subject, such as a 
mammal, or a patient, in a pharmaceutically acceptable form and in a therapeutically 
effective concentration. A composition is said to be "pharmacologically acceptable" 
if its administration can be tolerated by a recipient patient. Such an agent is said to 
be administered in a "therapeutically effective amount" if the amount administered 
is physiologically significant. An agent is physiologically significant if its presence 
results in a detectable change in the physiology of a recipient patient. 

The agents of the present invention can be formulated according to known 
methods to prepare pharmaceutically useful compositions, whereby these materials, 
or their functional derivatives, are combined in a mixture with a pharmaceutically 
acceptable carrier vehicle. Suitable vehicles and their formulation, inclusive of 
other human proteins, e.g., human serum albumin, are described, for example, in
REMINGTON'S PHARMACEUTICAL SCIENCES, 16th Ed., Osol, A., Ed.,
Mack Publishing, Easton PA (1980). In order to form a pharmaceutically 
acceptable composition suitable for effective administration, such compositions will 
contain an effective amount of one or more of the agents of the present invention, 
together with a suitable amount of carrier vehicle. 

Additional pharmaceutical methods may be employed to control the duration
of action. Controlled release preparations may be achieved through the use of
polymers to complex or absorb one or more of the agents of the present invention.
The controlled delivery may be effectuated by a variety of well known techniques,
including formulation with macromolecules such as, for example, polyesters,
polyamino acids, polyvinylpyrrolidone, ethylenevinylacetate, methylcellulose,
carboxymethylcellulose, or protamine sulfate, adjusting the concentration of the
macromolecules and the agent in the formulation, and by appropriate use of
methods of incorporation, which can be manipulated to effectuate a desired time
course of release. Another possible method to control the duration of action by
controlled release preparations is to incorporate agents of the present invention into
particles of a polymeric material such as polyesters, polyamino acids, hydrogels,
poly(lactic acid) or ethylene vinylacetate copolymers. Alternatively, instead of
incorporating these agents into polymeric particles, it is possible to entrap these
materials in microcapsules prepared, for example, by coacervation techniques or by
interfacial polymerization with, for example, hydroxymethylcellulose or
gelatine-microcapsules and poly(methylmethacrylate) microcapsules, respectively, or in
colloidal drug delivery systems, for example, liposomes, albumin microspheres,
microemulsions, nanoparticles, and nanocapsules, or in macroemulsions. Such
techniques are disclosed in REMINGTON'S PHARMACEUTICAL SCIENCES
(1980).

The invention further provides a pharmaceutical pack or kit comprising one 
or more containers filled with one or more of the ingredients of the pharmaceutical 
compositions of the invention. Associated with such container(s) can be a notice in
the form prescribed by a governmental agency regulating the manufacture, use or
sale of pharmaceuticals or biological products, which notice reflects approval by
the agency of manufacture, use or sale for human administration.

In addition, the agents of the present invention may be employed in 
conjunction with other therapeutic compounds. 

6 - Shot-Gun Approach to Megabase DNA Sequencing

The present invention further demonstrates that a large sequence can be 
sequenced using a random shotgun approach. This procedure, described in detail 
in the examples that follow, has eliminated the up-front cost of isolating and
ordering overlapping or contiguous subclones prior to the start of the sequencing
protocols.

Certain aspects of the present invention are described in greater detail in the 
examples that follow. The examples are provided by way of illustration. Other 
aspects and embodiments of the present invention are contemplated by the 
inventors, as will be clear to those of skill in the art from reading the present 
disclosure.

ILLUSTRATIVE EXAMPLES 

LIBRARIES AND SEQUENCING 
1. Shotgun Sequencing Probability Analysis

The overall strategy for a shotgun approach to whole genome sequencing
follows from the Lander and Waterman (Lander and Waterman, Genomics
2:231 (1988)) application of the equation for the Poisson distribution. According
to this treatment, the probability, P, that any given base in a sequence of size L, in
nucleotides, is not sequenced after a certain amount, n, in nucleotides, of random
sequence has been determined can be calculated by the equation $P = e^{-m}$, where m
is n/L, the fold coverage. For instance, for a genome of 2.8 Mb, m = 1 when 2.8
Mb of sequence has been randomly generated (1X coverage). At that point,
$P = e^{-1} = 0.37$. The probability that any given base has not been sequenced is the same
as the probability that any region of the whole sequence L has not been determined
and, therefore, is equivalent to the fraction of the whole sequence that has yet to be
determined. Thus, at one-fold coverage, approximately 37% of a polynucleotide of
size L, in nucleotides, has not been sequenced. When 14 Mb of sequence has been
generated, coverage is 5X for a 2.8 Mb sequence and the unsequenced fraction drops to
0.0067 or 0.67%. 5X coverage of a 2.8 Mb sequence can be attained by sequencing
approximately 17,000 random clones from both insert ends with an average
sequence read length of 410 bp.

Similarly, the total gap length, G, is determined by the equation $G = Le^{-m}$,
and the average gap size, g, follows the equation $g = L/N$, where N is the number
of random sequence reads (not the number of nucleotides, n, used above). Thus, 5X
coverage leaves about 240 gaps averaging about 82 bp in size in a sequence of a
polynucleotide 2.8 Mb long.

The treatment above is essentially that of Lander and Waterman, Genomics 
2:231 (1988). 
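
The figures quoted above can be reproduced with a short calculation. The following is a minimal sketch only, assuming that the average gap size is the sequence length divided by the number of reads (34,000 reads for 17,000 clones sequenced from both ends); the function and variable names are illustrative and do not appear in the original disclosure.

```python
import math

def shotgun_stats(genome_bp, n_reads, read_len_bp):
    """Poisson estimates for a random shotgun project (illustrative names)."""
    sequenced_bp = n_reads * read_len_bp   # n: random sequence generated, in nucleotides
    coverage = sequenced_bp / genome_bp    # m: fold coverage
    unsequenced = math.exp(-coverage)      # P = e^-m, fraction not yet sequenced
    total_gap_bp = genome_bp * unsequenced # G = L * e^-m
    avg_gap_bp = genome_bp / n_reads       # g = L / (number of reads), an assumption
    n_gaps = total_gap_bp / avg_gap_bp     # same as n_reads * e^-m
    return {
        "coverage": coverage,
        "unsequenced": unsequenced,
        "total_gap_bp": total_gap_bp,
        "avg_gap_bp": avg_gap_bp,
        "n_gaps": n_gaps,
    }

# 17,000 clones read from both insert ends, 410 bp average read, 2.8 Mb sequence
stats = shotgun_stats(genome_bp=2_800_000, n_reads=2 * 17_000, read_len_bp=410)
for name, value in stats.items():
    print(f"{name}: {value:.4g}")
```

Under these assumptions the sketch gives roughly 5X coverage, an unsequenced fraction just under 0.7%, and a little over 230 gaps averaging about 82 bp, in line with the approximate values stated above.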

2. Random Library Construction

In order to approximate the random model described above during actual 
sequencing, a nearly ideal library of cloned genomic fragments is required. The 
following library construction procedure was developed to achieve this end. 

Streptococcus pneumoniae DNA is prepared by phenol extraction. A
mixture containing 200 µg DNA in 1.0 ml of 300 mM sodium acetate, 10 mM Tris-HCl,
1 mM Na-EDTA, 50% glycerol is processed through a nebulizer (IPI Medical
Products) with a stream of nitrogen adjusted to 35 kPa for 2 minutes. The
sheared DNA is ethanol precipitated and redissolved in 500 µl TE buffer.

To create blunt ends, a 100 µl aliquot of the resuspended DNA is digested
with 5 units of BAL31 nuclease (New England BioLabs) for 10 min at 30°C in 200
µl BAL31 buffer. The digested DNA is phenol-extracted, ethanol-precipitated,
redissolved in 100 µl TE buffer, and then size-fractionated by electrophoresis
through a 1.0% low melting temperature agarose gel. The section containing DNA
fragments 1.6-2.0 kb in size is excised from the gel, and the LGT agarose is melted
and the resulting solution is extracted with phenol to separate the agarose from the
DNA. DNA is ethanol precipitated and redissolved in 20 µl of TE buffer for
ligation to vector.

A two-step ligation procedure is used to produce a plasmid library with
97% inserts, of which >99% were single inserts. The first ligation mixture (50 µl)
contains 2 µg of DNA fragments, 2 ng pUC18 DNA (Pharmacia) cut with SmaI
and dephosphorylated with bacterial alkaline phosphatase, and 10 units of T4 ligase
(GIBCO/BRL) and is incubated at 14°C for 4 hr. The ligation mixture then is
phenol extracted and ethanol precipitated, and the precipitated DNA is dissolved in
20 µl TE buffer and electrophoresed on a 1.0% low melting agarose gel. Discrete
bands in a ladder are visualized by ethidium bromide-staining and UV illumination
and identified by size as insert (I), vector (v), v+I, v+2I, v+3I, etc. The portion of
the gel containing v+I DNA is excised and the v+I DNA is recovered and
resuspended into 20 µl TE. The v+I DNA then is blunt-ended by T4 polymerase
treatment for 5 min. at 37°C in a reaction mixture (50 µl) containing the v+I linears,
500 µM each of the 4 dNTPs, and 9 units of T4 polymerase (New England
BioLabs), under recommended buffer conditions. After phenol extraction and
ethanol precipitation the repaired v+I linears are dissolved in 20 µl TE. The final
ligation to produce circles is carried out in a 50 µl reaction containing 5 µl of v+I
linears and 5 units of T4 ligase at 14°C overnight. After 10 min. at 70°C the
following day, the reaction mixture is stored at -20°C.

This two-stage procedure results in a molecularly random collection of 
single-insert plasmid recombinants with minimal contamination from double-insert 
chimeras (<1%) or free vector (<3%). 

Since deviation from randomness can arise from propagation of the DNA in
the host, E. coli host cells deficient in all recombination and restriction functions
(A. Greener, Strategies 3 (1):5 (1990)) are used to prevent rearrangements,
deletions, and loss of clones by restriction. Furthermore, transformed cells are
plated directly on antibiotic diffusion plates to avoid the usual broth recovery phase
which allows multiplication and selection of the most rapidly growing cells.

Plating is carried out as follows. A 100 µl aliquot of Epicurian Coli SURE
II Supercompetent Cells (Stratagene 200152) is thawed on ice and transferred to a
chilled Falcon 2059 tube on ice. A 1.7 µl aliquot of 1.42 M beta-mercaptoethanol
is added to the aliquot of cells to a final concentration of 25 mM. Cells are
incubated on ice for 10 min. A 1 µl aliquot of the final ligation is added to the cells
and incubated on ice for 30 min. The cells are heat pulsed for 30 sec. at 42°C and
placed back on ice for 2 min. The outgrowth period in liquid culture is eliminated
from this protocol in order to minimize the preferential growth of any given
transformed cell. Instead the transformation mixture is plated directly on a nutrient
rich SOB plate containing a 5 ml bottom layer of SOB agar (5% SOB agar: 20 g
tryptone, 5 g yeast extract, 0.5 g NaCl, 1.5% Difco Agar per liter of media). The 5
ml bottom layer is supplemented with 0.4 ml of 50 mg/ml ampicillin per 100 ml
SOB agar. The 15 ml top layer of SOB agar is supplemented with 1 ml X-Gal
(2%), 1 ml MgCl2 (1 M), and 1 ml MgSO4 per 100 ml SOB agar. The 15 ml top layer
is poured just prior to plating. Our titer is approximately 100 colonies/10 µl aliquot
of transformation.
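
The stated supplement concentrations can be checked with a simple dilution calculation. This is a rough sketch only: the helper name is hypothetical, and it assumes volumes are additive and that the added aliquot is mixed into the full volume given in the text.

```python
def diluted_concentration(stock_conc, added_vol, base_vol):
    """Concentration after adding a stock aliquot to a base volume (C1*V1 = C2*V2).

    The result is in the same units as stock_conc; both volumes must share one unit.
    """
    return stock_conc * added_vol / (added_vol + base_vol)

# 1.7 ul of 1.42 M beta-mercaptoethanol added to 100 ul of cells, reported in mM
bme_mM = diluted_concentration(1.42, 1.7, 100.0) * 1000.0

# 0.4 ml of 50 mg/ml ampicillin per 100 ml SOB agar, reported in ug/ml
amp_ug_per_ml = diluted_concentration(50.0, 0.4, 100.0) * 1000.0

print(f"beta-mercaptoethanol: {bme_mM:.1f} mM")
print(f"ampicillin: {amp_ug_per_ml:.0f} ug/ml")
```

On these assumptions the beta-mercaptoethanol comes to a little under 24 mM, consistent with the nominal 25 mM, and the ampicillin in the bottom layer to about 200 µg/ml.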

All colonies are picked for template preparation regardless of size. Thus, 
only clones lost due to "poison" DNA or deleterious gene products are deleted from 
the library, resulting in a slight increase in gap number over that expected. 

3. Random DNA Sequencing

High quality double stranded DNA plasmid templates are prepared using a 
"boiling bead" method developed in collaboration with Advanced Genetic 
Technology Corp. (Gaithersburg, MD) (Adams et al., Science 252:1651 (1991); 
Adams et al. Nature 355:632 (1992)). Plasmid preparation is performed in a 96- 
well format for all stages of DNA preparation from bacterial growth through final 
DNA purification. Template concentration is determined using Hoechst Dye and a 
Millipore Cytofluor. DNA concentrations are not adjusted, but low-yielding 
templates are identified where possible and not sequenced. 

Templates are also prepared from two Streptococcus pneumoniae lambda
genomic libraries. An amplified library is constructed in the vector Lambda GEM-12
(Promega) and an unamplified library is constructed in Lambda DASH II
(Stratagene). In particular, for the unamplified lambda library, Streptococcus
pneumoniae DNA (> 100 kb) is partially digested in a reaction mixture (200 µl)
containing 50 µg DNA, 1X Sau3AI buffer, 20 units Sau3AI for 6 min. at 23°C.
The digested DNA is phenol-extracted and electrophoresed on a 0.5% low
melting agarose gel at 2 V/cm for 7 hours. Fragments from 15 to 25 kb are excised
and recovered in a final volume of 6 µl. One µl of fragments is used with 1 µl of
DASH II vector (Stratagene) in the recommended ligation reaction. One µl of the
ligation mixture is used per packaging reaction following the recommended
protocol with the Gigapack II XL Packaging Extract (Stratagene, #227711). Phage
are plated directly without amplification from the packaging mixture (after dilution
with 500 µl of recommended SM buffer and chloroform treatment). Yield is about
2.5x10^3 pfu/µl. The amplified library is prepared essentially as above except the
lambda GEM-12 vector is used. After packaging, about 3.5x10^4 pfu are plated on
the restrictive NM539 host. The lysate is harvested in 2 ml of SM buffer and
stored frozen in 7% dimethylsulfoxide. The phage titer is approximately 1x10^9
pfu/ml.

Liquid lysates (100 µl) are prepared from randomly selected plaques (from
the unamplified library) and template is prepared by long-range PCR using T7 and
T3 vector-specific primers.

Sequencing reactions are carried out on plasmid and/or PCR templates
using the AB Catalyst LabStation with Applied Biosystems PRISM Ready
Reaction Dye Primer Cycle Sequencing Kits for the M13 forward (M13-21) and
the M13 reverse (M13RP1) primers (Adams et al., Nature 368:474 (1994)). Dye
terminator sequencing reactions are carried out on the lambda templates on a
Perkin-Elmer 9600 Thermocycler using the Applied Biosystems Ready Reaction
Dye Terminator Cycle Sequencing kits. T7 and SP6 primers are used to sequence
the ends of the inserts from the Lambda GEM-12 library and T7 and T3 primers are
used to sequence the ends of the inserts from the Lambda DASH II library.

Sequencing reactions are performed by eight individuals using an average of
fourteen AB 373 DNA Sequencers per day. All sequencing reactions are analyzed
using the Stretch modification of the AB 373, primarily using a 34 cm well-to-read
distance. The overall sequencing success rate is approximately 85% for
M13-21 and M13RP1 sequences and 65% for dye-terminator reactions. The
average usable read length is 485 bp for M13-21 sequences, 445 bp for M13RP1
sequences, and 375 bp for dye-terminator reactions.

Richards et al., Chapter 28 in AUTOMATED DNA SEQUENCING AND
ANALYSIS, M. D. Adams, C. Fields, J. C. Venter, Eds., Academic Press,
London, (1994) described the value of using sequence from both ends of
sequencing templates to facilitate ordering of contigs in shotgun assembly projects
of lambda and cosmid clones. We balance the desirability of both-end sequencing
(including the reduced cost of lower total number of templates) against shorter
read-lengths for sequencing reactions performed with the M13RP1 (reverse) primer
compared to the M13-21 (forward) primer. Approximately one-half of the
templates are sequenced from both ends. Random reverse sequencing reactions are
done based on successful forward sequencing reactions. Some M13RP1
sequences are obtained in a semi-directed fashion: M13-21 sequences pointing
outward at the ends of contigs are chosen for M13RP1 sequencing in an effort to
specifically order contigs.

4. Protocol for Automated Cycle Sequencing 

The sequencing is carried out using ABI Catalyst robots and AB 373 
Automated DNA Sequencers. The Catalyst robot is a publicly available 
sophisticated pipetting and temperature control robot which has been developed 
specifically for DNA sequencing reactions. The Catalyst combines pre-aliquoted 
templates and reaction mixes consisting of deoxy- and dideoxynucleotides, the 
thermostable Taq DNA polymerase, fluorescently-labelled sequencing primers, and 
reaction buffer. Reaction mixes and templates are combined in the wells of an 
aluminum 96-well thermocycling plate. Thirty consecutive cycles of linear 
amplification (i.e., one primer synthesis) steps are performed including
denaturation, annealing of primer and template, and extension; i.e., DNA 
synthesis. A heated lid with rubber gaskets on the thermocycling plate prevents 
evaporation without the need for an oil overlay. 

Two sequencing protocols are used: one for dye-labelled primers and a
second for dye-labelled dideoxy chain terminators. The shotgun sequencing
involves use of four dye-labelled sequencing primers, one for each of the four
terminator nucleotides. Each dye-primer is labelled with a different fluorescent dye,
permitting the four individual reactions to be combined into one lane of the 373
DNA Sequencer for electrophoresis, detection, and base-calling. ABI currently
supplies pre-mixed reaction mixes in bulk packages containing all the necessary
non-template reagents for sequencing. Sequencing can be done with both plasmid
and PCR-generated templates with both dye-primers and dye-terminators with
approximately equal fidelity, although plasmid templates generally give longer
usable sequences.

Thirty-two reactions are loaded per AB373 Sequencer each day, for a total 
of 960 samples. Electrophoresis is run overnight following the manufacturer's 
protocols, and the data is collected for twelve hours. Following electrophoresis 
and fluorescence detection, the ABI 373 performs automatic lane tracking and base- 
calling. The lane-tracking is confirmed visually. Each sequence electropherogram 
(or fluorescence lane trace) is inspected visually and assessed for quality. Trailing
sequences of low quality are removed and the sequence itself is loaded via software
to a Sybase database (archived daily to 8mm tape). Leading vector polylinker
sequence is removed automatically by a software program. Average edited lengths
of sequences from the standard ABI 373 are around 400 bp and depend mostly on
the quality of the template used for the sequencing reaction. ABI 373 Sequencers
converted to Stretch Liners provide a longer electrophoresis path prior to
fluorescence detection and increase the average number of usable bases to 500-600
bp.

INFORMATICS

1. Data Management 

A number of information management systems for a large-scale sequencing
lab have been developed. (For review see, for instance, Kerlavage et al.,
Proceedings of the Twenty-Sixth Annual Hawaii International Conference on
System Sciences, IEEE Computer Society Press, Washington D.C., 585 (1993)).
The system used to collect and assemble the sequence data was developed using the
Sybase relational database management system and was designed to automate data
flow wherever possible and to reduce user error. The database stores and
correlates all information collected during the entire operation from template
preparation to final analysis of the genome. Because the raw output of the ABI 373
Sequencers was based on a Macintosh platform and the data management system
chosen was based on a Unix platform, it was necessary to design and implement a
variety of multi-user, client-server applications which allow the raw data as well as
analysis results to flow seamlessly into the database with a minimum of user effort.

2. Assembly 

An assembly engine (TIGR Assembler) developed for the rapid and
accurate assembly of thousands of sequence fragments was employed to generate
contigs. The TIGR Assembler simultaneously clusters and assembles fragments of
the genome. In order to obtain the speed necessary to assemble more than 10^4
fragments, the algorithm builds a hash table of 12 bp oligonucleotide subsequences
to generate a list of potential sequence fragment overlaps. The number of potential
overlaps for each fragment determines which fragments are likely to fall into
repetitive elements. Beginning with a single seed sequence fragment, TIGR
Assembler extends the current contig by attempting to add the best matching
fragment based on oligonucleotide content. The contig and candidate fragment are
aligned using a modified version of the Smith-Waterman algorithm which provides
for optimal gapped alignments (Waterman, M. S., Methods in Enzymology
164:165 (1988)). The contig is extended by the fragment only if strict criteria for
the quality of the match are met. The match criteria include the minimum length of
overlap, the maximum length of an unmatched end, and the minimum percentage
match. These criteria are automatically lowered by the algorithm in regions of
minimal coverage and raised in regions with a possible repetitive element.
Fragments representing the boundaries of
repetitive elements and potentially chimeric fragments are often rejected based on
partial mismatches at the ends of alignments and excluded from the current contig.
TIGR Assembler is designed to take advantage of clone size information coupled
with sequencing from both ends of each template. It enforces the constraint that
sequence fragments from two ends of the same template point toward one another
in the contig and are located within a certain range of base pairs (definable for each
clone based on the known clone size range for a given library).
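
The overlap-detection step described above, a hash table of 12 bp oligonucleotides used to list potential fragment overlaps, can be illustrated as follows. This is a simplified sketch and not the TIGR Assembler implementation: all function names are hypothetical, reverse-complement matches are ignored, and the alignment, match criteria, and clone-size constraints are omitted.

```python
from collections import defaultdict
from itertools import combinations

K = 12  # oligonucleotide length named in the text

def build_kmer_index(fragments, k=K):
    """Map each k-mer to the set of fragment ids that contain it."""
    index = defaultdict(set)
    for frag_id, seq in fragments.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(frag_id)
    return index

def candidate_overlaps(fragments, k=K):
    """Count distinct shared k-mers for each fragment pair sharing at least one."""
    shared = defaultdict(int)
    for ids in build_kmer_index(fragments, k).values():
        for a, b in combinations(sorted(ids), 2):
            shared[(a, b)] += 1
    return dict(shared)

def overlap_counts(fragments, k=K):
    """Number of potential overlap partners per fragment.

    Unusually high counts suggest that a fragment lies in a repetitive element.
    """
    partners = defaultdict(set)
    for a, b in candidate_overlaps(fragments, k):
        partners[a].add(b)
        partners[b].add(a)
    return {frag_id: len(partners[frag_id]) for frag_id in fragments}

# Toy 60-base sequence (made up for illustration) and three "reads" cut from it
toy = "ATGGCGTACCTTAGCAAGTCGGATACCGTTCAAGGCTTAACGTGCATCCGATAGGTCAAC"
reads = {"r1": toy[0:30], "r2": toy[15:45], "r3": toy[40:60]}

print(candidate_overlaps(reads))
print(overlap_counts(reads))
```

In this toy case r1 and r2 overlap by 15 bases and therefore share 12-mers, while r2 and r3 overlap by only 5 bases and are not reported; a shorter oligonucleotide length would trade specificity for sensitivity.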

The process resulted in 391 contigs as represented by SEQ ID NOs: 1-391. 

3. Identifying Genes

The predicted coding regions of the Streptococcus pneumoniae genome
were initially defined with the program GeneMark, which finds ORFs using a
probabilistic classification technique. The predicted coding region sequences were
used in searches against a database of all nucleotide sequences from GenBank
(October, 1997), using the BLASTN search method to identify overlaps of 50 or
more nucleotides with at least a 95% identity. Those ORFs with nucleotide
sequence matches are shown in Table 1. The ORFs without such matches were
translated to protein sequences and compared to a non-redundant database of
known proteins generated by combining the Swiss-prot, PIR and GenPept
databases. ORFs that matched a database protein with BLASTP probability less
than or equal to 0.01 are shown in Table 2. The table also lists assigned functions
based on the closest match in the databases. ORFs that did not match protein or
nucleotide sequences in the databases at these levels are shown in Table 3.
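
The assignment of ORFs to Tables 1-3 follows a simple decision rule, sketched below. The thresholds are those stated above; the record type and field names are hypothetical, and the sketch assumes the best BLASTN and BLASTP hits have already been parsed from the search output.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class OrfHits:
    """Best database hits for one predicted ORF (illustrative field names)."""
    blastn_overlap_nt: int = 0                  # length of best nucleotide overlap
    blastn_identity_pct: float = 0.0            # percent identity over that overlap
    blastp_probability: Optional[float] = None  # best BLASTP probability, None if no hit

def assign_table(hits: OrfHits) -> int:
    """Return 1, 2, or 3 according to the criteria described in the text."""
    # Table 1: nucleotide overlap of 50 or more bases at 95% identity or better
    if hits.blastn_overlap_nt >= 50 and hits.blastn_identity_pct >= 95.0:
        return 1
    # Table 2: no such nucleotide match, but a protein match with P <= 0.01
    if hits.blastp_probability is not None and hits.blastp_probability <= 0.01:
        return 2
    # Table 3: no match at either level
    return 3

examples = [
    OrfHits(blastn_overlap_nt=120, blastn_identity_pct=98.5),
    OrfHits(blastn_overlap_nt=40, blastn_identity_pct=99.0, blastp_probability=1e-20),
    OrfHits(blastp_probability=0.5),
]
print([assign_table(h) for h in examples])
```

The nucleotide test is applied first, so an ORF that satisfies both criteria is placed in Table 1, matching the order of the searches described above.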

ILLUSTRATIVE APPLICATIONS

1. Production of an Antibody to a Streptococcus pneumoniae Protein

Substantially pure protein or polypeptide is isolated from the transfected or
transformed cells using any one of the methods known in the art. The protein can
also be produced in a recombinant prokaryotic expression system, such as E. coli,
or can be chemically synthesized. Concentration of protein in the final preparation
is adjusted, for example, by concentration on an Amicon filter device, to the level
of a few micrograms/ml. Monoclonal or polyclonal antibody to the protein can
then be prepared as follows.

2. Monoclonal Antibody Production by Hybridoma Fusion 

Monoclonal antibody to epitopes of any of the peptides identified and
isolated as described can be prepared from murine hybridomas according to the
classical method of Kohler, G. and Milstein, C., Nature 256:495 (1975) or
modifications of the methods thereof. Briefly, a mouse is repetitively inoculated
with a few micrograms of the selected protein over a period of a few weeks. The
mouse is then sacrificed, and the antibody producing cells of the spleen isolated.
The spleen cells are fused by means of polyethylene glycol with mouse myeloma
cells, and the excess unfused cells destroyed by growth of the system on selective
media comprising aminopterin (HAT media). The successfully fused cells are
diluted and aliquots of the dilution placed in wells of a microtiter plate where
growth of the culture is continued. Antibody-producing clones are identified by
detection of antibody in the supernatant fluid of the wells by immunoassay
procedures, such as ELISA, as originally described by Engvall, E., Meth.
Enzymol. 70:419 (1980), and modified methods thereof. Selected positive clones
can be expanded and their monoclonal antibody product harvested for use. Detailed
procedures for monoclonal antibody production are described in Davis, L. et al.,
Basic Methods in Molecular Biology, Elsevier, New York. Section 21-2 (1989).

3. Polyclonal Antibody Production by Immunization

Polyclonal antiserum containing antibodies to heterogeneous epitopes of a
single protein can be prepared by immunizing suitable animals with the expressed
protein described above, which can be unmodified or modified to enhance
immunogenicity. Effective polyclonal antibody production is affected by many
factors related both to the antigen and the host species. For example, small
molecules tend to be less immunogenic than others and may require the use of
carriers and adjuvant. Also, host animals vary in response to site of inoculations
and dose, with both inadequate and excessive doses of antigen resulting in low titer
antisera. Small doses (ng level) of antigen administered at multiple intradermal
sites appear to be most reliable. An effective immunization protocol for rabbits
can be found in Vaitukaitis, J. et al., J. Clin. Endocrinol. Metab. 33:988-991
(1971).

Booster injections can be given at regular intervals, and antiserum harvested
when antibody titer thereof, as determined semi-quantitatively, for example, by
double immunodiffusion in agar against known concentrations of the antigen,
begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of
Experimental Immunology, Weir, D., ed., Blackwell (1973). Plateau concentration
of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 µM).
Affinity of the antisera for the antigen is determined by preparing competitive
binding curves, as described, for example, by Fisher, D., Chap. 42 in: Manual of
Clinical Immunology, second edition, Rose and Friedman, eds., Amer. Soc. For
Microbiology, Washington, D.C. (1980).

Antibody preparations prepared according to either protocol are useful in 
quantitative immunoassays which determine concentrations of antigen-bearing 
substances in biological samples; they are also used semi-quantitatively or
qualitatively to identify the presence of antigen in a biological sample. In addition, 
antibodies are useful in various animal models of pneumococcal disease as a means 
of evaluating the protein used to make the antibody as a potential vaccine target or 
as a means of evaluating the antibody as a potential immunotherapeutic or 
immunoprophylactic reagent. 

4. Preparation of PCR Primers and Amplification of DNA

Various fragments of the Streptococcus pneumoniae genome, such as those
of Tables 1-3 and SEQ ID NOS:1-391, can be used, in accordance with the present
invention, to prepare PCR primers for a variety of uses. The PCR primers are
preferably at least 15 bases, and more preferably at least 18 bases in length. When 
selecting a primer sequence, it is preferred that the primer pairs have approximately 
the same G/C ratio, so that melting temperatures are approximately the same. The 
PCR primers and amplified DNA of this Example find use in the Examples that 
follow. 
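
The primer criteria above (minimum length and matched G/C ratio) can be sketched as a simple check. The following Python sketch is illustrative only: the function names, the 0.10 G/C tolerance and the 4 degree tolerance are assumptions, not values from this specification, and the Wallace rule (2 degrees per A/T, 4 degrees per G/C) is a common rough approximation substituted here for any particular melting-temperature method. The two example primers are taken from the first 18 bases of SEQ ID NO:1 and the reverse complement of its last 18 bases.

```python
def gc_fraction(primer: str) -> float:
    """Return the fraction of G and C bases in a primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> float:
    """Rough melting temperature in deg C by the Wallace rule (an approximation)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2.0 * at + 4.0 * gc


def primers_compatible(forward: str, reverse: str, min_len: int = 15,
                       max_gc_diff: float = 0.10,
                       max_tm_diff: float = 4.0) -> bool:
    """Check a primer pair against the length and G/C-matching preferences.

    min_len follows the 'at least 15 bases' preference in the text; the two
    tolerances are hypothetical cut-offs chosen only for illustration.
    """
    if len(forward) < min_len or len(reverse) < min_len:
        return False
    gc_ok = abs(gc_fraction(forward) - gc_fraction(reverse)) <= max_gc_diff
    tm_ok = abs(wallace_tm(forward) - wallace_tm(reverse)) <= max_tm_diff
    return gc_ok and tm_ok


if __name__ == "__main__":
    fwd = "CCAAGCAAAACCAGCTAC"   # bases 1-18 of SEQ ID NO:1
    rev = "CTAGGAAACTAGCCGCAG"   # reverse complement of bases 5608-5625
    print(gc_fraction(fwd), wallace_tm(fwd), primers_compatible(fwd, rev))
```

In practice a nearest-neighbor melting-temperature model and checks for self-complementarity would likely be used in addition to this coarse screen.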

5. Gene Expression from DNA Sequences Corresponding to ORFs 

A fragment of the Streptococcus pneumoniae genome provided in Tables 1-3 
is introduced into an expression vector using conventional technology. 
Techniques to transfer cloned sequences into expression vectors that direct protein 
translation in mammalian, yeast, insect or bacterial expression systems are well 
known in the art. Vectors and expression systems are commercially 
available from a variety of suppliers including Stratagene (La Jolla, California), 
Promega (Madison, Wisconsin), and Invitrogen (San Diego, California). If 
desired, to enhance expression and facilitate proper protein folding, the codon 
context and codon pairing of the sequence may be optimized for the particular 
expression organism, as explained by Hatfield et al., U.S. Patent No. 5,082,767, 
incorporated herein by this reference. 



The following is provided as one exemplary method to generate 
polypeptide(s) from cloned ORFs of the Streptococcus pneumoniae genome 
fragment. Bacterial ORFs generally lack a poly A addition signal. The addition 
signal sequence can be added to the construct by, for example, splicing out the poly 
A addition sequence from pSG5 (Stratagene) using BglI and SalI restriction 
endonuclease enzymes and incorporating it into the mammalian expression vector 
pXT1 (Stratagene) for use in eukaryotic expression systems. pXT1 contains the 
LTRs and a portion of the gag gene of Moloney Murine Leukemia Virus. The 
positions of the LTRs in the construct allow efficient stable transfection. The 
vector includes the Herpes Simplex thymidine kinase promoter and the selectable 
neomycin gene. The Streptococcus pneumoniae DNA is obtained by PCR from the 
bacterial vector using oligonucleotide primers complementary to the Streptococcus 
pneumoniae DNA and containing restriction endonuclease sequences for PstI 
incorporated into the 5' primer and BglII at the 5' end of the corresponding 
Streptococcus pneumoniae DNA 3' primer, taking care to ensure that the 
Streptococcus pneumoniae DNA is positioned such that it is followed by the poly 
A addition sequence. The purified fragment obtained from the resulting PCR 
reaction is digested with PstI, blunt ended with an exonuclease, digested with 
BglII, purified and ligated to pXT1, now containing a poly A addition sequence 
and digested with BglII. 
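
The primer construction described above, with a PstI site on the 5' primer and a BglII site at the 5' end of the 3' primer, can be sketched as follows. This Python sketch is a minimal illustration under stated assumptions: the function names and the 18-base annealing length are hypothetical, no extra clamp bases are added 5' of the sites, and the input is assumed to be the coding-strand ORF sequence. Only the recognition sequences (CTGCAG for PstI, AGATCT for BglII) are standard facts.

```python
PSTI_SITE = "CTGCAG"    # PstI recognition sequence
BGLII_SITE = "AGATCT"   # BglII recognition sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tailed_primers(orf: str, anneal_len: int = 18) -> tuple[str, str]:
    """Build a primer pair carrying restriction sites at their 5' ends.

    The 5' primer is the PstI site followed by the first anneal_len bases
    of the ORF; the 3' primer is the BglII site followed by the reverse
    complement of the last anneal_len bases. anneal_len is an assumed
    value chosen to match the preferred minimum of 18 bases.
    """
    orf = orf.upper()
    if len(orf) < 2 * anneal_len:
        raise ValueError("ORF is too short for two non-overlapping primers")
    forward = PSTI_SITE + orf[:anneal_len]
    reverse = BGLII_SITE + reverse_complement(orf[-anneal_len:])
    return forward, reverse


if __name__ == "__main__":
    example_orf = "ATG" + "GCA" * 20 + "TAA"   # hypothetical ORF for illustration
    fwd, rev = tailed_primers(example_orf)
    print(fwd)
    print(rev)
```

With this arrangement the amplified fragment carries PstI upstream and BglII downstream of the insert, so that after digestion the insert is oriented ahead of the poly A addition sequence in the vector.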

The ligated product is transfected into mouse NIH 3T3 cells using 
Lipofectin (Life Technologies, Inc., Grand Island, New York) under conditions 
outlined in the product specification. Positive transfectants are selected after 
growing the transfected cells in 600 µg/ml G418 (Sigma, St. Louis, Missouri). 
The protein is preferably released into the supernatant. However, if the protein has 
membrane binding domains, the protein may additionally be retained within the cell 
or expression may be restricted to the cell surface. Since it may be necessary to 
purify and locate the transfected product, synthetic 15-mer peptides synthesized 
from the predicted Streptococcus pneumoniae DNA sequence are injected into mice 
to generate antibody to the polypeptide encoded by the Streptococcus pneumoniae 
DNA. 



WO 98/18931    PCT/US97/19588 

Alternatively, and if antibody production is not possible, the Streptococcus 
pneumoniae DNA sequence is additionally incorporated into eukaryotic expression 
vectors and expressed as, for example, a globin fusion. Antibody to the globin 
moiety then is used to purify the chimeric protein. Corresponding protease 
cleavage sites are engineered between the globin moiety and the polypeptide 
encoded by the Streptococcus pneumoniae DNA so that the latter may be freed 
from the former by simple protease digestion. One useful expression vector for 
generating globin chimerics is pSG5 (Stratagene). This vector encodes a rabbit 
globin. Intron II of the rabbit globin gene facilitates splicing of the expressed 
transcript, and the polyadenylation signal incorporated into the construct increases 
the level of expression. These techniques are well known to those skilled in the art 
of molecular biology. Standard methods are published in methods texts such as 
Davis et al., cited elsewhere herein, and many of the methods are available from the 
technical assistance representatives from Stratagene, Life Technologies, Inc., or 
Promega. Polypeptides of the invention also may be produced using in vitro 
translation systems such as in vitro Express™ Translation Kit (Stratagene). 

While the present invention has been described in some detail for purposes 
of clarity and understanding, one skilled in the art will appreciate that various 
changes in form and detail can be made without departing from the true scope of 
the invention. 

All patents, patent applications and publications referred to above are 
hereby incorporated by reference. 

(1) GENERAL INFORMATION:

     (i) APPLICANT: Charles Kunsch
                    Gil H. Choi
                    Patrick S. Dillon
                    Craig A. Rosen
                    Steven C. Barash
                    Michael R. Fannon
                    Brian A. Dougherty

    (ii) TITLE OF INVENTION: Streptococcus pneumoniae Polynucleotides and Sequences

   (iii) NUMBER OF SEQUENCES: 391

    (iv) CORRESPONDENCE ADDRESS:
          (A) ADDRESSEE: Human Genome Sciences, Inc.
          (B) STREET: 9410 Key West Avenue
          (C) CITY: Rockville
          (D) STATE: Maryland
          (E) COUNTRY: USA
          (F) ZIP: 20850

     (v) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage
          (B) COMPUTER: HP Vectra 486/33
          (C) OPERATING SYSTEM: MSDOS version 6.2
          (D) SOFTWARE: ASCII Text

    (vi) CURRENT APPLICATION DATA:
          (A) APPLICATION NUMBER:
          (B) FILING DATE:
          (C) CLASSIFICATION:

   (vii) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER:
          (B) FILING DATE:

  (viii) ATTORNEY/AGENT INFORMATION:
          (A) NAME: Brookes, A. Anders
          (B) REGISTRATION NUMBER: 36,373
          (C) REFERENCE/DOCKET NUMBER: PB340P1

    (ix) TELECOMMUNICATION INFORMATION:
          (A) TELEPHONE: (301) 309-8504
          (B) TELEFAX: (301) 309-8512

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5625 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CCAAGCAAAA CCAGCTACAG CTAAAGGAAC TTACGTAACA AACTTGACTA TCACAACTAC 60
TCAAGGTGTT GGTATCAAAG TTGACGTAAA CTCACTTTAA TCAGTAGTTA AAGTAATGTA 120
AAAAAGTTGA AGACGCTATG TCTCAACTTT TTTTGATGTA CGACGGGCAT GTTGTATAGT 180
AGATGTGTAC TATTCTAGTT TCAATCTACT ATAGTAGCTC AGAAGTCGGT ACTTAAACGT 240
GCTATATCAA AACCAGTCCT TGAAAAACGT GGACTGGTTT CGTGTTTGGA TTATTACCTT 300
GAACGACATG CGTTAAAAGT TAGTTGAACC GCCGTATGCC GAACGGACGT ACGGTGGTGT 360
GAGAGGGGCT AGAGATTATC CCCTACTCGA TTTCGAAATC TAGTGGAATG AATCTGGAAT 420
AGTCCATCGA GCTTTCTAAT ACTCTTCGAA AATCTCTTCA AACCACGTCA ACGTCGCCTT 480
GCCGTGCGTA TGGTTACTGA CTTCGTCAGT TCTATCCACA ACCTCAAAAC AGTGTTTTGA 540
GCTGACTACG TCAGTTCCAT CTACAACCTC AAAACAGTGT TTTGAGCAAC CTGCGGCTAG 600
TTTCCTAGTT TGCTCTTTGG TTTTCATTGA GTATAACACA TTGTTAGAAG TTGGTTTAAA 660
TTTCCTAATC AGTTTGTTCA CATTTACCTT CGATATATTA TATCCCATAG TTAAGGTTGG 720
TCATACAGAT GATTATAGTC ATGGAGCCGT AAAACTTAGT GTTTCTTTAG TTGACAAAGA 780
TGCCATGAAA AAAATATTTG TAACTGTAAT AGGATATTTT GAAATAAATA TAGATGAAAA 840
TATCACCGAT ATTCTATACG TAAATGGTAC TGCTATTCTT TATCTTTATT TACGTTCAAT 900
TGTTTCAATA GTTTCGGCAA TTGATAGCAG TGAAGCAATG TTGCTACCTA TCATTAATGT 960
TTTAGAGTTA CTAGATAAAT CTCAACCTTT TGAAGAAGAA TAATTTATTA GCTCACTAAA 1020
TTGAGGGTAA GGAAAAGTAA AAGCAGTAAG AAAAATGTCT TGCATTATAC AGCAACCTTT 1080
TGGGAATGAG TGGATGGATT GAATAAAATT TGATTAAGAG TGGATGATTT ATCTGTAGAT 1140
TATTATTGGA CAGTTAGTCT TGAAGTAGTC TAAGAATTAG GTTATAATCA GTAGAAGCCT 1200
TGCTAATAAT GAGGAGGTTA GTTTATGTAT AGTAGACTGA ATCTAAAATA GTACGAAACA 1260
ATTGCTAAAA CATTTATAGA AATTAATTTT ACTTTCCCAA TCGATTTGTT CTCATCTTAT 1320
TTCAATCCGC TATATATTAT GGTATCGAAT CTTCATCAGA ATGATAAAAT TAATCAATTG 1380
ATATCTGATT ACAAACAGAA TATGAAAGCT TTTTATATCA CTATTGAAAA ATTTATACGA 1440
GATGATGAAA GCCTTAAGTG TTATTTTATA AAGGTTATTT CAAGTCGTTC CAAGGTAACA 1500
AGTCTAGATC AGATTGAAGC TGATAAAACG ATACAAAGAA AATATTCAAG TGAGCTAAAA 1560
AAATTTATTG GATTTTATAA TGAGATTATT TGTGAGGAAA ATAGTTTCCT ACATGTACGA 1620
AAGAGGTGGT CGAGTTGGTT TAGGTAGTCG ATGCGTGAGT TGATAATTCT CAGGGTATGG 1680
ACTTCTTTTT CATGAATGAG GTAAAAGAGC AGGTATTGTT TAGAGACAAT CATTCTGAGC 1740
ATATTTTCTG GATAGAGGGA GTATCCGATT TTATGATCAA AGTTAATACC GCCCTCTGGT 1800
GAGAAGATGA GTAGGTTGGT AATTTAAACT ATTAAACAGA ATTTTTGATT AAAAGTATTA 1860
TTTCATGAGA GAAATCCTAA TTTCACAATC CATAGGCAAA CGCTTGCATT TCGTTTTTTA 1920
TTGGACTATA ATAGGTTGGT ATAAAGCCTT CTGTAGTAAT AAAATGTAGA AGGTGTAGAA 1980
AGTAAGGATT TAGAATATTT GTAGTTAAAA ACACAATGTT GCTATTCCTT ACGATAGGGA 2040
GATAGATATG GCAATGATAG AAGTGGAACA TCTTCAGAAA AATTTTGTGA AGACTGTTAA 2100
GGAACCGGGC TTGAAGGGGG CTTTGCGCTC CTTTATTCAT CCTGAAAAGC AGACCTTTGA 2160
AGCGGTCAAG GATTTGACCT TTGAGGTTCC AAAAGGGCAG ATTTTAGGAT TTATCGGGGC 2220
AAATGGTGCT GGGAAGTCGA CAACCATTAA AATGCTGACA GGAATTTTGA AACCAACATC 2280
TGGTTTTTGT CGGATTAACG GCAAGATTCC CCAGGACAAT CGGCAAGATT ATGTCAAAGA 2340
TATTGGCGTA GTCTTTGGAC AACGCACCCA GCTATGGTGG GATTTGGCTC TGCAAGAGAC 2400
CTACACTGTC TTAAAAGAGA TTTATGATGT GCCAGACTCG CTCTTTCATA AGCGTATGGA 2460
CTTTTTGAAT GAAGTCTTGG ATTTGAAGGA CTTTATCAAG GATCCCGTGC GGACTCTTTC 2520
ACTGGGACAA CGGATGCGGG CGGATATTGC GGCCTCCTTG CTCCACAATC CCAAGGTTCT 2580
TTTTTTAGAT GAGCCGACCA TTGGTTTGGA CGTTTCGGTT AAGGATAATA TTCGTGGGGG 2640
AATTACTCAG ATCAATCAAG AGGAAGAAAC TACCATTCTT TTGACCACTC ACGATTTGAG 2700
TGATATTGAG CAACTTTGTG ATCGGATTTT CATGATTGAC AAGGGGCAAG AGATTTTTGA 2760
TGGAACGGTG AGCCAACTCA AGGAGACCTT TGGTAAGATG AAGACTCTCT CTTTTGAACT 2820
GCTACCAGGT CAAAGTCATC TCGTCTCTCA CTATGACGGT CTGTCTGATA TGACCATTGA 2880
TAGACAAGGA AACAGCCTCA ACATTGAATT TGATAGTTCT CGCTACCAGT CAGCTGACAT 2940
TATCAAGCAA ACCCTGTCTG ATTTTGAAAT CCGCGATTTG AAGATGGTGG ATACGGATAT 3000
TGAGGATATT ATCCGTCGCT TCTACCGAAA GGAGCTCTAG GATGATGAAA TTGTGGAGAC 3060
GTTATAAACC CTTTATCAAT GCAGGGGTTC AGGAGTTGAT TACTTACCGA GTCAACTTTA 3120
TTCTCTATCG GATTGGCGAT GTCATGGGGG CTTTTGTGGC CTTTTATCTC TGGAAGGCTG 3180
TCTTTGATTC TTCGCAAGAG TCTTTGATTC AGGGCTTCAG TATGGCGGAT ATCACCCTCT 3240
ACATCATCAT GAGTTTTGTG ACCAATCTTC TGACTAGATC CGATTCGTCC TTTATGATTG 3300
GGGAGQAGGT CAAGGATGGC TCCATTATCA TGCGTTTGTT GCGACCAGTG CATTTTGCGG 3360
CCTCCTATCT TTTCACCGAG CTTGGTTCCA AGTGGTTGAT TTTTATCAGC GTTGGCCTTC 3420
CATTTTTAAG TGTCATTGTC TTGATGAAAA TCATATCGGG TCAAGGTATT GTAGAGGTGC 3480
TAGGATTAAC TGTCATTTAT CTTTTTAGCT TAACGCTCGC CTATCTGATT AACTTTTTCT 3540
TTAATATTTG CTTTGGATTT TCAGCCTTTG TGTTTAAAAA TCTTTGGGGT TCCAACCTAC 3600
TTAAGACTTC CATAGTGGCT TTTATGTCGG GGAGTTTGAT TCCCTTGGCA TTTTTTCCAA 3660
AGGTTGTTTC AGATATTCTC TCCTTTTTGC CTTTTTCATC CTTGATTTAT ACTCCAGTTA 3720
TGATCATTGT TGGAAAATAC GATGCCAGTC AGATTCTTCA GGCACTCCTT TTGCAGTTCT 3780
TCTGGCTCTT AGTGATGGTG GGATTGTCTC AGTTAATTTG GAAACGGGTC CAGTCCTTTA 3840
TCACCATTCA AGGAGGTTAG TATGAAAAAA TATCAACGAA TGCATCTGAT TTTTATCAGA 3900
CAATACATCA AACAAATCAT GGAATATAAG GTAGATTTTG TGGTTGGTGT CTTGGGAGTC 3960
TTTCTGACTC AAGGCTTGAA TCTCTTGTTT CTCAATGTCA TCTTTCAACA TATTCCATTC 4020
CTAGAAGGCT GGACCTTTCA AGAGATAGCT TTCATTTATG GATTTTCCTT GATTCCCAAG 4080
GGAATGGACC ATCTCTTTTT TGACAATCTC TGGGCACTAG GGCAACGCCT AGTCCGAAAA 4140
GGGGAGTTTG ACAAGTATCT GACTCGTCCC ATCAATCCTC TCTTTCACAT CCTAGTTGAA 4200
ACCTTTCAGA TTGATGCCTT GGGTGAACTC TTAGTCGGTG GTATTTTATT GGGAACAACA 4260
GTGACCAGCA TTGTTTGGAC TCTTCCAAAA TTCCTGCTTT TCCTAGTTTG TATTCCTTTT 4320
GCGACCTTGA TTTATACTTC TCTTAAAATC GCAACAGCCA GTATCGCCTT TTGGACTAAG 4380
CAGTCAGGCG CCATGATTTA CATCTTCTAT ATGTTCAATG ACTTTGCTAA GTATCCGATT 4440
TCTATTTACA ATTCTCTTCT TCGTTGGTTG ATTAGCTTTA TCGTGCCTTT CGCCTTTACA 4500
GCCTACTATC CAGCTAGCTA TTTCTTACAG GAAAAGGATG TGTTCTTTAA CGTAGGAGGT 4560
TTGATGTTGA TTTCTCTGGT TTTCTTTGTT ATTTCCCTTA AACTTTGGGA TAAGGGCTTA 4620
GATTCCTACG AAAGTGCGGG TTCGTAAAAG CTAAAGTAAG ACTAAAATCA AGAAAGAAAC 4680
TTATGATGTT TGTAATTGAA GAAGTCAAGG ATGAAAATCA AAAAAAGGCA GTTGTCGCTG 4740
AGGTTTTGAA GGATTTGCCA GAATGGTTTG GAATCCCAGA AAGCACACAA GCCTATATAG 4800
AAGGAACCAC GACACTGCAA GTTTGGACCG CCTATCAGGA GAGTGATTTG ACTAGATTTG 4860
TAAGCTTATC CTATTCGAGT GAAGATTGTG CAGAGATTGA TTGTCTCGGC GTAAAAAAGC 4920
TTATCAAGGT AGAAAAATTG GGAGCCAATT GCTTGCTACT TTAGAGAGTG AAGCTCGTAA 4980
AAAAGTTGGT TATCTGCAGG TCAAAACAGT GGCAGAAGGT TCTAATAAAG ATTATGATCG 5040 

AACAAATGAC TTTTATCGAG GTCTTGGCTT TAAAAAGTTA GAGATTTTTC CTCAACTATG 5100 

GAATCCGCAA AATCCTTGTC AGATTTTGAT TAAAAAGCTT GAATAATATT ACTTGACATC 5160 

TATTCTCAGA GTGCTATACT GTAAGTGTAA TCGCCGATTT AGCTTAGTTG GTAGAGCAAG 5220 

GCACTCGTAA AGCCTAGGTT ATAGGTAGAT AAACGACTGA GGATTTGAAA AAATAGATAG 5280 

GTAGAAGATA ACCGTTAAGC CTTACTCTTA GCGGTTATTT ATATTGTTTA ATAGCGCTAA 5340 

TATTTTATCA ATTATGCCTG TTTTCGTGTT TCTGGTAGTT GTTCAAGTTT ATTGCTACTA 5400 

TTTTTGATGG TATGAATGTG CTTATAATGT ATCCCGGTTA ACGAAAGTTT TGGACTTATA 5460 

CTCTTCGAAA ATCTCTTCAA ACCACGTCAA CGTCGCCTTG GCGTGCGTAT GGTTATGACT 5520 

TCGTCAGTTC TATCCACAAC CTCAAAACAG TGTTTTGAGT GACTACGTCA GTTCCATCTA 5580 

CAACCTCAAA ACACTGTTTT GCCCAATCTG CGGCTAGTTT CCTAG 5625 
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7571 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

CTCTCCAGCT TTCCTTGCGA GTTGGCCATG TTGTGTCTTT AAGAAGTCTA AAAATATCTC 60 

CAATAAAACG CATCGCTCTC TCCTATCTCG TTTCTCTGTG TGTAGTGTAC TTGCCACAAT 120 

GCTTACAAAA TTTATTTACT TCTAGTCGTG TAGGCTTGAG GTTTCCGCTG ATCTTGATTG 180 

AATAGTTTCT CGAACCACAA ACCGCACAAG CTAGGCTTGC TTTTTTTAGT GCCATAACGC 240 

CTCCATCTTA TCCATTATAA CAAGAAAGCT AGGCTTTGAC AAGCATCTTA GCGAAATAGA 300 

TTGACTATCG AATCCCATAT TGTTTGAGCC TTTTCCTTAA TCTTCGCATC TGAGATAGCC 360 

CGGCTAGCCT CATCTACTAG ACTTTGCGCA CGCCCTCGAA TATCAGACAA ATTATCATCT 420 

GTCTGGCTAT TATCATTGGT TTGTACTTGT CTTTTTGTAT TGGCTGGTGC AATTCCATTT 480 

TGCTTATAAG CATTTTCAAC CGTAAAGGTA CTTCCTGGCG TATAAGGTAA AATGGTATTG 540 

GCAATGTTTC TAAAGACATG AGCTGCACCG TTTGAAGTAG AGCCAGCTAG ATAGTGGTTT 600 

TCATCAGTGG TCGGAAAGCC AAGCCAGTGG CTAATCACTA CATCCGGAGT ATAACCAATT 660 

ACCCACTGGT CACTTGTGTA CTCCGGATTG AAAACTGCTT CAGTTGTTCC AGTTTTCCCT 720 
GCCATGACAT AGTCTGCAGG CGATGAACTA ATACCGGTAC CGTTGGTGAA AGTCCCCAAC 780 

ATCATACTGG TCATCTTGTC AGCTACAGAC TTATCAATCA CCCGTTTTTG TGAATTTTTA 840 

TGACTCGCAA TAACTTGTCC ACTAGCATTT TCAATTCTAC TAATAAAATG AGCTTCAGGC 900 

ATTAAACCTT CATTTGCAAA GGCGGCGTAT GCTTGAGCCA TTTGAAGAGG GTTGGTTTCA 960 

ACACCGCTTC CCAAGGCGAC ACCAAGAACA CGGTCGACCT TTTCCATGTT GAGTCCGAAT 1020 

TTTTCGCCTG CCTCAAAAGC CTTGTCGACA CCCAAATCAT TAACAGTGGC AACAGCAGGT 1080 

AGATTAAGCG ATTCTGCCAA GGCTTGATAC ATAGGAACTT CTCGACTCGT TTTGATCCCT 1140 

GCATAGTTAT CAACCTTATA GCTGTCATAC TGCATGGTAT GGTTATCCAA CTGCTTATTC 1200 

AAAGCCCAGC TTGCTTCAAC TGCTGGCGTA TAAACAACTA AAGGCTTAAT TGTAGAACCA 1260 

GGACTACGCT TTGATTGGGT TGCATAGTTG AAATTCCGGA ATCCAGTTTT ATCATTGTCA 1320 

GCAACTTGAC CGACAACTCC ACGAACTCCC CCTGTTTTCG GTTCGAGGGC TACACTTCCT 1380 

GATTGAGCAA ACGTTCCATC CTCTGCCCTC GGAAATAGCG ATGTGTTTTC ATAAACAATC 1440 

TGCATATTTG C1TGGTAGTT TTGGTCCAGC TCTGTGTAAA TGCGGTAGCC ATTATTGACA 1500 

ATCTCTTCCT CTGTTAGATT ATACTTGGAA ACAGCTTCAT TAACCACCGC ATCAAAATAA 1560 

GAGGGGTAAC GGTAATCTGA GATTTTTCCT TCATACTTAT CGTGCAATTG CGAAGTCATA 1620 

TCAACTTCAG CAGCTTTGGT TTCTTGGTTT TTATCAATAT ATCCTGCTGC AACCATATTC 1680 

TGCAAGACAG TATCGCGCCG ATTAGTAGAA TCTTCTACGG AATTCAAGGG ATTATACAGT 1740 

TCCGGCCCCT TGAGCATCCC TGCCAGAGTC GCAGCTTGAT CCAGACTCAC TTCTGATGCA 1800 

GAAACTCCAA AGTATTTCTT ACTCGCATCT TCTACAGCCG ACACACCATT TCCAAAATAA 1860 

GCGTTGTTAA GGTACATGGT TAGAATTTGC TCCTTACTAT ATTTTTTGCT TAATTCTAAG 1920 

GCAAGGAAAA ATTCTTTCGC TTTTCTCTCA ACAGTTTGAT CCTGCGATAA ATAGGCGTTT 1980 

TTAGCCAGCT GTTGGGTAAT GGTAGAGCCA CCACCTGAAC GTCCAGCAGT GACAATAGCC 2040 

AAGAAAAAAC GGCCATAGTT AATCCCGTCA TTTTTATAGA AAGAACGGTC TTCTGTCGCA 2100 

ATAACAGCAT TCTGCAAGTT TTTACTGATG TCAGTCAGCT CAACATAGGT TCCCTTTTGA 2160 

CCAGACAAGG CACCAGCCTC TTTTTCTTCA CGGTCAAAAA TAAGAGTCCG AGTTTTCAAG 2220 

GCATTTTGCA AATCATTGAC ATTGGTCGAC TTGGCTACAG CAAACAAATA GATTCCAACT 2280 

AGCAAGCCTG CACTCAAACC TAGTATAAGG ATAATCTTTG TTAGATGATA ACGACGCCAG 2340 

AATTTTCGAA TCGGACCTAC TTGGGCTAAT TTTTTTCGAT CACTACGAGA GCGACGTAAG 2400 

ATAGTAGAAT CAGAGTCCTC TAGTTCACTT GTTTCTTTTT TAAAAAGAGA AAGAAATTTC 2460 

TCAAATAATT TATCTAATTT CATGCGTTTA TTTTATCATC TTCATCATAG GAAGACAAGA 2520 

ATTTAGCTAT TTCCTATCCA AATAGGGCTT TTTTTGTTAC AATATCTGTA TGCAATTCAC 2580
ATTTACATTA CCCGCCTCTC TACCTCAAAT GACAGTAAAG CAATTACTTG AGGAACAACT 2640
CCTCATCCCT AGAAAAATCC GTCATTTTTT GAGAATCAAG AAACATATTT TGATAAATCA 2700
AGAAGAAGTC CACTGGAAGG AAATCGTAAA TCCTGGAGAT GTTTGCCAGT TGACTTTTGA 2760
CGAGGAAGAT TATTCCCAAA AGACGATCCC TTGGGGCAAC CCAGACTTAG TGCAGGAAGT 2820
TTATCAAGAT CAACACTTGA TTATTGTAAA CAAACCAGAG GGGATGAAAA CGCATGGTAA 2880
TCAACCAAAC GAAATTGCCC TTCTTAACCA TGTCAGTACC TATGTTGGCC AAACCTGCTA 2940
TGTCGTTCAT CGTCTGGACA TGGAAACCAG TGGCTTAGTT CTCTTTGCCA AAAATCCTTT 3000
TATCCTGCCC ATTCTCAATC GCTTATTGGA GAAAAAAGAG ATTTCTAGAG AATATTGGGC 3060
TCTAGTTGAT GGAAATATCA ACAGAAAAGA ACTTGTTTTC AGAGACAAAA TTGGACGTGA 3120
TCGCCATGAT CGTAGAAAAA GAATAGTTGA TGCAAAAAAT GGGCAATATG CTGAAACGCA 3180
TGTAAGGAGA TTAAAGCAAT TCTCAAACAA GACTTCCTTG GCTCATTGCA AGCTAAAGAC 3240
AGGGCGAACC CATCAGATTC GTGTGCACCT TTCGCATCAT AATCTTCCTA TCCTGGGAGA 3300
CCCTCTCTAT AATAGTAAAT CAAAGACAAG CCGGCTTATG CTTCATGCCT TCCGACTTTC 3360
CTTTACCCAC CCACTTACTT TAGAGAAGCT AACTTTCACT ACGCTTTCAA ATACATTTGA 3420
AAAAGAATTA AAAAAGAATG GATGATCGTG TCATCCATTT TTCCATATAA AAAAGCAAGA 3480
CCACAAAGCC TTGCTTTCTA TCAACTCAAG AATTATTTAG CAATTTTTGC GAAGTATTCA 3540
AGAGTACGAA CAAGTTGTGC AGTGTATGAC ATTTCGTTGT CGTACCATGA TACAACTTTA 3600
ACCAATTGTT TACCGTCAAC GTCAAGAACT TTAGTTTGAG TTGCGTCAAA CAATGAACCG 3660
TAAGACATAC CTACGATATC TGAAGATACG ATTGGATCTT GTGTGTAACC GTATGATTGG 3720
TTTGAAGCTG CTTTCATAGC TGCGTTCACT TCATCAACAG TAACGTTCTT TTCAAGAACT 3780
GCTACCAATT CAGTAACTGA TCCAGTTGGA GTTGGAACGC GTTGTGCAGA TCCGTCAAGT 3840
TTACCATTCA ATTCTGGGAT TACAAGACCG ATAGCTTTTG CAGCACCAGT TGAGTTAGGA 3900
ACGATGTTTG CAGCACCAGC GCGAGCACGG CGAAGGTCAC CACCACGGTG TGGTCCGTCA 3960
AGGATCATTT GGTCACCAGT GTAAGCGTGG ATAGTAGTCA TCAATCCTTC AACAACACCA 4020
AAGTTGTCTT GAAGAGCTTT AGCCATTGGA GCCAAGCAGT TTGTAGTACA TGAAGCACCT 4080
GAGATAACTG TTTCAGTACC GTCAAGAACG TCGTGGTTAG TGTTGAATAC AACTGTTTTA 4140
ACGTCGTTTC CACCAGGAGC AGTGATAACA ACTTTTTTAG CTCCACCTTT AAGGTGTTTT 4200
TCAGCTGCTT CTTTCTTAGC AAAGAAACCA GTAGCTTCAA GAACGATTTC TACACCGTCA 4260
GTAGCCCAGT CGATTTGTTC TGGATCACGT TCAGCAGAAA CTTTGATGAA TTTACCGTTA 4320 
ACTTCAAATC CACCTTCTTT AACTTCAACA GTACCGTCGA AACGACCTTG AGTTGTGTCG 4380 
TATTTCAACA AGTGTGCAAG CATAACTGGA TCTGTAAGGT CGTTGATGCG TGTAACTTCA 4440 
ACACCTTCTA CGTTTTGGAT ACGACGGAAA GCAAGACGAC CGATACGTCC GAAACCGTTA 4500 

ATACCAACTT TAACTACCAT TAGTGATTTC CTCCTTATGA AAATCATGAA ATTTTTATTG 4560 

TGAAAAGAGT AACTTGAATC ACTACAAATC ACCTTTCAAC AAACCTATTA TACAACTATT 4620 

TGAGTTGAAT TGCAAGTATG GCCATTGTTT TTCTATGTTA GTTTCTTTTT AAGACTGTAA 4680 

ACCAAGGAAT CCCTTACTAT TCATAGCATA ACGATTCTAT AGGATCCATT TTACTAATCT 4740 

TACGCGCCGG GAAGTAGGCT GAGACATAAC CAAGTAATAG AGCGAAAACT AGAGTTCCTA 4800 

AAACAGATAA AAGATTTAAT TTAAAAACCT TAGTGATGGA TGGGTAAAAG TGACTTACAA 4860 

TCGCATTCGC CAAACTTCCC ACCCCTTGTG CAACCAAAAA TGCCAGGAGC AAGGCGATGC 4920 

CTACAATCCA GATAGCCTCG TAAATAAAAA TTCCTTTGAC ATCACGATTC TGATAACCAA 4980 

CTGCTTTCAT GACACCTATT TCCTTGGAAC GTTGCATGAT ATTGATGTAA ATAATGATAC 5040 
CAATCATAAC CGCTGCTACC ACAATAGCTT GTGATGAAAG CACAATCAAT AATCCCTGAA 5100 

TAACACGAAT AAAGGTAATC ACAATATCAA GAACTCTCTG TTGAGAAAGC ACAGTATACT 5160 

TCTTATTTTT CTGTAATTCT TCTGTTACTA CTTTTGTCTG TGATGGATCT TTGAGTTCCA 5220 

AGATAAAATA AGATACAGCT TTCGTAAATC CAGGCTCTTT CAAAATCGTT TCCATTTGAT 5280 

GAGACAGCAT GAAACTGTTG CTGTCCTCCA TGTCATCTTC ATCATTGATT ACACGTACAA 5340 

TCTTCGTTTG AAATTGAGCA ATCTTACTAG TTTCGGCAGC ACTTTGTACA ATGCTGGCTG 5400 

AGACTGATTT GCCAATAAGA TCATTAGCTG TCAAATTTTT TCCTGTCTGT TCATTCCAAT 5460 

TTTTTAGTAA ACTGCTTGGA ATCGTTAATC CCTGTTCATT TGTATCAGTA TAGAGGGATC 5520 

CAGCCAACAC TTTGTCCGTC TCATTATTAC TAACAGAGAT ACTTGTATCA TCATAAAGAG 5580 

TCACTACTTG AGCATAAGAA GGCATGGTTT GACTCAGATC CATTTCTTGC CCATCTATAG 5640 

TAATATTTGA CATGTTCATC CCAAAAGGAC TCTCCAAATA TTTAATAGCT TCTTTCCCAA 5700 

CTGTATCCGT GATATATAGT CAATTGAAAC AAGAGCAGGA TAAAAAAGCC TCGTAAAAGG 5760 

TATTGCAACT TGGTAATACC TTTTTGAGGT GCTTTTTGAT ATGAGCCCAT GTTTTCTCAA 5820 

TAGGATTGTA CTCAGGCGAG TAGGGAGGAA GAGGTAAAAG TTTATGCCCA AACTCTTCGC 5880 

ATAAAAGTTC TAGCTTCCCC ATTCTATGGA ATCTTACATT ATCCATAATA ATAACCGATG 5940 

GTGTGTTTAA TGTTGGTAAG AGAAAATTCT GAAACCAAGC TTCAAAAAAG TCGCTCGTCA 6000 

TCGTCTCTTC GTAAGTCATT GGAGCGATTA ATTCACCATT TGTTAGACCT GCAACCAAAG 6060 
AAATCCTCTG ATATCTTCTT CCAGATACTT TGCCTCTTAT TAATTGACCT TTTAATGAGC 6120 

GACCATATTC TCGATAAAAA TAAGTATCGA ATCCTGTTTC GTCAATCTAA ACAGGTGCTA 6180 

GGTGCTTTAA ACTATTAAAA TTCTTAAGAA ATAAGGCTAC TTTTTCTGGG TCTTGTTCAT 6240 

AGTAGGTGTG GTTCTTTTTT CGAGTGTAGC CCATAGCTTT GAGCGTATAG TGGATGGTAG 6300 

TTGGATGACA GCCAAATTCA GAAGCTATTT CAGTCAAATA AGCGTCTGGA TTGTCAGTAA 6360 

GATAGTTTTT AAGTCTATCT CTATCAACCT TTCTTGGTTT TATTCCTTTT ACTTGGTGGT 6420 

TTAGCTCTCC TGTTTTCTCT TTTAGCTTTA ACCAGCCATA AATGGTATTA CGTGAGATTT 6480 

GGAAAACGTG TGATGCTTCT GTTATACTAC CTGTTCGCTC ACAATAAGAG AGAACTTTTT 6540 

TACGAAAATC TATTGAATAT GCCATAAAAA GATTATACCA CATTGTGTAC TATTTTTGGT 6600 

TCATTTTACT ATATTTGAAG AGGCGTTTAA ACTATCTGAC ATAAAACTCG TTCTAGAGGA 6660 

AAGACATCCT TTAAAAAGTT AGTTTATTTT ACAACTTAGA CATCAAGGTA GGTTAACCCC 6720 

TTCATGGAAA AATCAAGACT CTTAGCACTA TGGGTTAAAC TACCACTGGA GACGTAATCA 6780 

ATCGCTAAAC CACGAAAACG GCTAATAGTG GTCATATCAA TATTTCCAGA ACATTCAATC 6840 

CGAGAACGTC CTGCAATTAG GGTAATGGCC TGTTCAATCT GTTCCAATGA CATATTATCC 6900 

AACATGATAA TATCAGCACC CGCCGCCGCA GCTTCTTCGG CAGCAGCAAG GCTTTCCACT 6960 

TCCACCTCGA CCATTTTCAC AAAAGGGGCA TAGGCACGGG CTTGAGCAAT TGCCTTTTGA 7020 

ACACTACCTA CTGCCGCAAT GTGATTGTCT TTTAGCAGGA TAGCATCTGA TAAATTAAAG 7080 

CGATGATTAT AGCCACCGCC AACTCTCACG GCATATTTCT CAAAAAGACG TAAATTAGGA 7140 

GTAGTTTTTC GAGTATCAAA TACCTTAATG CAATCATCGC CTAAGGCTTC TACATAAGCA 7200 

GCTGTCATCG AAGCAATCCC TGATAAATGT TGTAAAAAAT TCAAGGCAAC GCGTTCACAT 7260 

GTTAAGAGAC TTCTCACCGA GCCTATGATT TCTAAAACCA AATCGCCACT AGTCAAACGA 7320 

TCCCCATCCT TAAATTGATG AGGATTCTGG AAGGTCACCT CGGCATCAAA TAGGGTAAAA 7380 

ACCCTTTGAA AAACGGTTAG CCCCGCTAAA ACACCAGCTT CCTTGGCAAA AAGCGAGACC 7440 

TTGGCTTGGC CATGATGATC AAAAATGGCA TTGGTACTGT AATCTTCGGA ATGAACATCT 7500 

TCTCGCAAGG CTGCTTTCAA TGTATCATCT ATTTGAAAAG GGGTTAAATC AGTTGAAATG 7560 

ATTGACATCA C 7571 
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26385 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

TTTGCTAGTG GCTTAAATTC TTCAGGAAAA TCAGGCGTAT CTAAAAGTCG TGTCGTTTTT 60 

GTTTCATCTA TATAAAGACT TCCTGCTCCC CCTACAACTA GAAAACGTGT CTGTGTTCCA 120 

GCAAGAAGCT GATTAAATAG TTCGATTGAT TTGCTGTGGA GCGGTAGCGT ATCTGGTGTA 180 

TAAGCACCAA ACGCTGAAAT AACAGCATCA AATCCAGTAA GATCATCTTT TGTCAACTCA 240 

AATAAATCTT TTTTAATAAT AGACTCAGCT TGACTTTTGT TTTCAGAACG AACAATAGCC 300 

GTTACTTCAT GTCCTCGTTT GACTGCTTCT TCAACAATTG CTTTCCCCGC TTGTCCATTT 360 

GCTGCAATAA CTGCTAGTTT CATTTTTTAT ACCTCTCTTG TTGTAATTAT TTTAGTTACA 420 

GAAATTGTGA CACTCTTAAT AATCAATGTC AATAGTCTTG CTTAATTATT ATCAAAATAT 480 

TTCTACCAAG AAAACTAACC ATGATTCTAG TGAAAAAAAA TCTTCTTTGT CAACAAATTT 540 

ACTTTCTTGT TTTAAACATG CTATAATAAT CATAGCAAGA GATCTAAGTT GTCTGTTTTT 600 

TTAAAACGAG GTGATTATCA TGCGTAGATT CTATTCCCAT CTCCCCTACT ATCTGGTCAT 660 

ATTATTCTTT TATTGGCCAC TTTATGAGTT GTTCTTACTA GTTGTTTCTG ACCCCCTTAC 720 

ACTCAAGGGA CTCTATATAA ACAATCTTCT CTTCTTTACA CCTCTGGTAA TCTTGATTGT 780 

ATCGTTACTC TATAGCTACC GTTTCCGTTT CTCACTTTGA TGGTTAGTTG GTAACGGACT 840 

GCTCTTTTAC TTTACTATCA TAACCTTTGG TGAGTTTATA CTAATTTACT TGCTAATCTA 900 

TGAAACAGTT GCTCTGGTCG GCATGGATTC TGGTATTAGC ATCAAGCATA TTCTACAAAA 960 

AATGAAAAAC AAAAAACTTT CACAAAATCC TTGAAAAATC TCACAATCAT GCTATAATAA 1020 

TCCATAGAGA CAAGTCACTT AGTCCCTTTC TACTAGAGAG TGCGTGGTTG CTGGAAACGC 1080 

ATAGGAAGTC TAAACTGATA CTACTCTTGA GTTTTTTATG AAAACATAAA ACGGTGGCCA 1140 

CGTTAGAGCC GATCAGAGGT GTCCCTCTCT TTTGAGGTAC ATAAATGAAG GTGGAACCAC 1200 

GTTGCGACGT CCTTTCGAGG ATGTCGCATT TTTTTATTAG GATACTAATT ATGGAGTTGC 1260 

AAGAATTAGT GGAGCGCAGT TGGGCAATCC GACAAGCTTA TCACGAACTG GAAGTTAAGC 1320 

ATCATGATTC CAAGTGGACG GTAGAAGAAG ACCTCTTGGC TTTATCTAAT GATATTGGAA 1380 

ATTTCCAACG ACTGGTGATG ACAAAGCAAG GACGCTACTA TGATGAAACA CCCTACACAC 1440 

TGGAACAAAA ACTTTCAGAA AATATCTGGT GGCTATTAGA ACTTTCTCAA CGTTTGGATA 1500 

TAGACATTCT GACGGAAATG GAAAACTTCC TCTCTGATAA AGAAAAGCAA TTGAACGTTA 1560 

GGACTTGGAA GTAGTCTGCT GATAAAAAAT CAATGCTTAG AAACTATGAA ATAATAAAAA 1620 
AGGAGAACAT CATGATTAAC ATTACTTTCC CAGATGGCGC TGTTCGTGAA TTCGAATCTG 1680 

GCGTAACAAC TTTTGAAATT GCCCAATCTA TCAGCAATTC CCTAGCTAAA AAAGCCTTGG 1740 

CTGGTAAATT CAACGGCAAA CTCATCGACA CTACTCGCGC TATCACTGAA GATGGAAGCA 1800 

TCGAAATTGT GACACCTGAT CACGAAGATG CCCTTCCAAT CTTGCGTCAC TCAGCAGCTC 1860 

ACTTGTTCGC CCAAGCAGCT CGTCGTCTTT TCCCAGACAT TCACTTGGGA GTTGGTCCAG 1920 

CCATCGAAGA TGGTTTCTAC TACGATACTG ACAACACAGC TGGTCAAATC TCTAACGAAG 1980 

ACCTTCCTCG TATCGAAGAA GAAATGCAAA AAATCGTCAA AGAAAACTTC CCATCTATTC 2040 

GTGAAGAAGT GACTAAAGAC GAGGCACGTG AAATCTTCAA AAATGACCCT TACAAGTTGG 2100 

AATTGATTGA AGAACACTCA GAAGACGAAG GCGGTTTGAC TATCTATCGT CAGGGTGAAT 2160 

ATGTAGACCT CTGCCGTGGA CCTCACGTTC CATCAACAGG TCGTATCCAA ATCTTCCACC 2220 

TTCTCCATGT AGCTGGTGCG TACTGGCGTG GAAACAGCGA CAACGCTATG ATGCAACGTA 2280 

TCTACGGTAC AGCTTGGTTT GACAAGAAAG ACTTGAAAAA CTACCTTCAA ATGCGTGAAG 2340 

AAGCTAAGGA ACGTGACCAC CGTAAACTTG GTAAAGAGCT TGACCTCTTT ATGATTTCAC 2400 

AAGAAGTGGG ACAAGGTTTG CCATTCTGGT TGCCAAATGG TGCGACTATC CGTCGTGAAT 2460 

TGGAACGCTA CATCGTAAAC AAAGAGTTGG TTTCTGGCTA CCAACACGTC TACACTCCAC 2520 

CACTTGCTTC TGTTGAGCTT TACAAGACTT CTGGTCACTG GGATCATTAC CAAGAAGACA 2580 

TGTTCCCAAC CATGGACATG GGTGACGGGG AAGAATTTGT CCTTCGTCCA ATGAACTGTC 2640 

CGCACCACAT CCAAGTTTTC AAACACCATG TTCACTCTTA CCGTGAATTG CCAATCCGTA 2700 

TCGCTGAAAT CGGTATGATG CACCGTTACG AAAAATCTGG TGCCCTCACT GGCCTTCAAC 2760 

GTGTACGTGA AATGTCACTC AACGACGGTC ACCTATTCGT TACTCCAGAA CAAATCCAAG 2820 

AAGAATTCCA ACGTGCCCTT CAGTTGATTA TCGATGTTTA TGAAGACTTC AACTTGACTG 2880 

ACTACCGCTT CCGCCTCTCT CTTCGTGACC CTCAAGATAC TCATAAGTAC TTTGATAACG 2940 

ATGAGATGTG GGAAAATGCC CAAACCATGC TTCGTGCAGC TCTTGATGAA ATGGGCGTGG 3000 

ACTACTTTGA AGCCGAAGGT GAAGCAGCCT TCTACGGACC AAAATTGGAT ATCCAGATTA 3060 

AAACTGCCCT TGGAAAAGAA GAAACCCTTT CTACTATCCA ACTTGATTTC TTGTTGCCAG 3120 

AACGCTTCGA CCTCAAATAC ATCGGAGCTG ATGGCGAAGA TCACCGTCCA GTCATGATCC 3180 

ACCGTGGGGT TATCTCAACT ATGGAACGCT TCACAGCTAT CTTGATTGAG AACTACAAGG 3240 

GGGCCTTCCC AACATGGCTG GCACCACACC AAGTAACCCT CATCCCAGTA TCTAACGAAA 3300 

AACACGTGGA CTACGCTTGG GAAGTGGCCA AGAAACTCCG TGACGGCGGT GTCCGTGCAG 3360 
ACGTAGATGA GCGCAATGAA AAAATGCAGT TCAAGATCCG TGCTTCACAA ACCAGCAAGA 3420 

TTCCTTACCA ATTAATTGTT GGAGACAAAG AAATGGAAGA CGAAACAGTC AACGTTCGTC 3480 

GCTACGGCCA AAAAGAAACA CAAACTGTCT CAGTTGATAA TTTTGTTCAA GCTATCCTAG 3540 

CTGATATCGC CAACAAATCA CGCGTTGAGA AATAAGAGTC TAGCATAAAA GCCTCCAATC 3600 

TGGAGGCTTT TTCTCATCTA TTTTTACTCA AGGACTAAGT TCACTTGAGC AAACTGAATC 3660 

CGCACTGTCG TTCCTTTTCC GACCTCAGAC TCGATACGAA TCTGGTGCCC CAGTTCTTCA 3720 

GAAATTTTCT TAGATAGATA AAGGCCAAGT CCAGAGGACT GCTGGGTCAA ACGGCCATTG 3780 

TATCCTGAAA AGCCACGTTC AAATACTCGG AGGACATCAC TGTTTTTTAT CCCGATTCCC 3840 

GTATCTTTGA TACAAAGCTC TTGGTCATCC ATATAAATCT CCAGACCACC TTCCTTGGTG 3900 

TACTTGAGAC TGTTTGAGAT GATTTGCTCA ATAACCACTA GCAGCCACTT TTTATCCGTC 3960 

ACGATTTCTT TATCAAGGTC ATGTAGATTG ACATTTAAGC CTTTTTGAAT AAAGAAAAGA 4020 

GCATATTTAC GAATTATTTC CTTGACCAAG TCCTCAATTT GAACCTGCTT TAAGACCAAA 4080 

TCATCATGGA AACTTTCTAA ACGCAGGTAC TGTAAAACTA GGTTGGTATA GGAGTCGATT 4140 

TTGAAAATTT CCTGTTCTAG CTGCTGCTTC AGTTGGCGGT CGACCACTTC TGCAACTAAG 4200 

AGTTGACTGG CTGCAATGGG GGTCTTTATC TGATGGACCC ACAAGGTATA GTAATCCAGC 4260 

AAATCCGTCA GTTTTCTTTC TGCTTTTGAC CTCTGCTGAT AGAGTTCCAT CTCACGCGCT 4320 

TCTAATTTTT CTGCTAAAGC TATTTCCAAA GGAGACTTGG CTTCCCTCTC TCCATAGAGA 4380 

AGTTCCTGGC GATAGACCTG CGTTTCCACC AATATGTCCC AAGTGAAAAA TAATATGGTT 4440 

ACAAAGCAAC ACAAGAAGAA AAAGTAGAGG AAGTAAATTC CTAGACTGGC AAATAAAAAC 4500 

TGAAAGAGTA AGACAAGAAA TGCCAAAGAA AGCAGATAGA TAAAAAGACG ACTACGGGAG 4560 

CGCAGATAGG CTAGAAAAAA TTGTTTCCAA TCAAGCATGC TTCAATCCGT ACCCTATTCC 4620 

TTTCTTGGTC TCGATAAATC CTACCAATCC CTGCTCCTCC AACTTTTTAC GCAAACGAGC 4680 

CACATTGACA GAGAGGGTAT TATCATCAAT GAAAAAGTCA CTGTTCCAAA GTTCCCGCAT 4740 

CAGGTCGTCA CGTGCTACGA TGTTGCCTGC ATGCTCAAAT AACACGCGTA AAATCTGGAA 4800 

TTCATTCTTG GTCAAATTCA AGACTTGCCC TTGATAATGT AAATCCATGG ATTTGGTATT 4860 

GAGGATAACA CCAGCATATT CCAGCAAACT CTCATCACGC CCAAACTCAT AGGAACGACG 4920 

CAACAAGCCC TGAACCTTAG CTAAAAGAAC CTGCTGGTCA AAAGGCTTGG TCACAAAGTC 4980 

ATCCGCCCCC ATATTGATTG CCATGACAAT ATCCATAGCC TGGTCTCTCG AAGAAAGAAA 5040 

CATGATAGGT ACCTTGGAAA TCTTGCGGAT TTCCTGACAC CAGTGATAAC CATTAAACAA 5100 

GGGCAAACCA ATATCCATGA GGACCAGATG AGGTTCCGAC TGAACAAATA GACTCAAAAC 5160 
TTCCATAAAG TCTTCTACCA GGACCACTTC AAATCCCCAT TCAGAGAGCA TTTTCCCAAT 5220 

CTGTTGACGA ATGACCTGAT CATCTTCTAT TAATAAAATC TTGTGCATGC GCTTCTCCTT 5280 

TTCCATTATT ATAACAGATT TTTCCATGCT AGATGGTCTG AAACTGAATT TGAAATAGCC 5340 

TGTTTTTAGC CAGTACAAAC AGGCTATGCT ACTAGCTAAT TTGAGGGAAA TTTGCTAAGA 5400 

TAAATAAAAA GAAAGGAGCT CTTATGGCCA ATATTTTTGA CTATCTGAAA GATGTCGCAT 5460 

ATGATTCTTA TTACGACCTT CCCTTGAATG AGTTAGACAT TCTAACCTTA ATAGAAATCA 5520 

CCTACCTCTC CTTTGATAAT CTGGTCTCCA CACTTCCTCA ACGTCTTTTA GATCTAGCAC 5580 

CTCAGGTTCC AAGAGATCCC AGCATGCTTA CTAGCAAAAA TCGCCTTCAA TTATTAGATG 5640 

AATTGGCTCA ACACAAGCGC TTCAAAAATT GCAAACTCTC CCATTTTATC AACGACATCG 5700 

ACCCTGAACT GCAAAAGCAA TTTGCGGCTA TGACTTATCG TGTCAGCCTC GATACCTATC 5760 

TGATTGTCTT TCGTGGGACA GATGACAGTA TCATTGGCTG GAAGGAAGAT TTCCACCTGA 5820 

CCTATATGAA GGAAATTCCT GCTCAAAAGC ACGCCGTTCG CTATTTAAAG AACTTTTTTG 5880 

CCCATCATCC TAAGCAAAAG GTTATTCTAG CTGGGCATTC CAAGGGAGGA AATCTCGCTA 5940 

TCTATGCTGC TAGCCAAATT GAGCAAAGTT TGCAAAATCA GATCACAGCA GTTTATACAT 6000 

TTGATGCACC TGGTCTCCAT CAAGAATTGA CACAGACTGC GGGTTATGAA AGGATAATGG 6060 

ATAGAAGCAA GATATTCATT CCACAAGGTT CCATTATCGG TATGATGCTG GAAATTCCTG 6120 

CTCACCAAAT GATCGTTCAG AGTACTGCCG TGGGTGGCAT CGCGCAGCAC GATACGTTTA 6180 

GTTGGCAGAT TGAGGACAAG CACTTCGTCC AACTGGATAA GACCAACAGT GATAGCCAGC 6240 

AAGTAGACAC AACCTTTAAA GAATGGGTGG CCACAGTCCC TGACGAAGAA CTTCAGCTCT 6300 

ACTTCGACCT CTTCTTTGGC ACTATTCTTG ATGCTGGTAT TAGCTCTATC AATGACTTGG 6360 

CTTCCTTAAA GGCGCTTGAA TACATTCATC ATCTCTTTGT CCAAGCTCAA TCCCTCACTC 6420 

CAGAAGAAAG AGAAACCTTG GGTCGCCTTA GGCAGTTATT GATTGATACT CGTTAGGAGG 6480 

CATGGAAAAA TAGATAATAC TCTTGAAAAT TAAATGTATA CAAAACAAAA GACCTAGAAT 6540 

ACATACTTTC ATGTGCATTC TAAGTCTTTT TAAATAGAAT CTAATAGTCA ATAAAAATCA 6600 

AAGAGCATTG AGAGATAATG GGGCTTGGAA CGTCCCTCTC GCTTCAACAA AATGAGCCCA 6660 

TTATAGATTA AAAAGATGCC ACTTAGAAAA AGCAAAAAAG GAAGTAAGAC AAAGGCAAAT 6720 

ATATAAAAAG CTAACTGAAC ATTCTCGTAT CCATTTTTAT AAAAAAGGTA GGATAGATAA 6780 

AAATAACTTG AAATGAGGGA TAATAAAAAT AATACTGGAT TCGAGAAACT TCTATTATCC 6840 

TTCCAAAATG ACACTATAAA GGCTAATACA ATTCCTATAA CGAGATACAT TTCTTACTCC 6900

TTTAATAGCT ACATTTTATC ATAATTATCC AAAGAAAAAA GAGGGCATTT ATCCCTCTTA 6960

ATCCTTCATC TGACTCTCTG CATCGGCCAC GACTTTTTCT AGACTGGTTT GACCAAGTTC 7020 

TGCCTCCATA GTCAACTGAA TTCTCTCCAA TTTTTGATCC AAAACATCAT GAATATGAGC 7080 

TCCTACAGGG CAATTTGGAT TCGGATTGTC ATGGAAACTG AAGAGTTGAC CTGTCTTACC 7140 

AAGACATTCG ACCGCCTGAT AAACATCTAA AAGACTAATA TCCTTAAGGT CCTTGACAAT 7200 

CTCTGTTCCG CCCGTTCCAC GCGCTACTGA AATCAGCTCT GCCTTCTTCA ACTGGGACAA 7260 

GATCTTTCTG ATAATGACAG GATTGACCCC GACACTAGCA GCCAGAAAAT CACTGGTCAC 7320 

CTTGCTTTCC TTCCCCTCGA GGGCAATGAT TATCAGCATA TGAGTCGCAA TGGTAAATCT 7380

ACTTGGAATT TGCATCCTCT TCTCCTTTTT ACGAGGCTAC CCTGCCTCTA CTCTTCTTTT 7440 

TCTATTATTA TACCCTTTTT AGTTGTAATG TCAATCGTTA CCACTTTTGA ACCAGTCGTC 7500 

TAACTCCCGA TCGCAGCCCT CTTTCTGAGC CAATTCTCTC AAAAATTCCT GATGATGAGT 7560 

ATGGTGGATC CCATTGACCA GACTTTCATA GTAAACCTCA AAATAGGGAA GTCTCAGGTC 7620 

TTTAGCCAGC TGCAATTCAG CTGCTACATC GTAGTCTACC CGTCGGAAGT CCATATCTAC 7680 

CAGGCCTTTG TCATCAAACT CCAAAATCAT ATACTGGGCC CGCAAGTCCT TCCGTAGCTG 7740 

AGCGTCCAAA AAGAAAGGTT GGCCAATCGA ACCCGGATTG ACAATCAATT GCCCACCAGT 7800 

CCCGTAACGA AGCAACTGCT GGTGAATATG TCCATAAACA GCAATATCAC AGGGAGGATG 7860 

AGTCACCAAG CGGTCAAACT CCTCTTGTTT GCCAGTATGA ATCAACTCTC GCCCCCAGTT 7920 

CTTATCAGGC AGATGATGGC TAATTCCCAC CGTCAAATCC CCAAACTGAC GATGAATTTG 7980 

AAGAGGTTGA TTGTGGAGCA CTTCAATTTC TTCTAGGGAA ATTTCCTCTA AAACATACTG 8040 

GCACTGGCGC AAGAGATAGC GTTGACTGGG GCGAGTACTG TCCAATTCCT TACGGACACC 8100 

ATGCCAAAGA CTGTCTTCCC AGTTTCCCAA AACTCTAGCC GTAATCGGTA GTTGATCCAA 8160 

CAAGTCCAAA ATCCTTGTAC GCCCTGTCCC TGGCATGAGA ATATCTCCCA AAAGCCAGTA 8220 

TTCATCCACT CCTATCTGCC GAGCATCTGC CAAAACAGCC TCCAAGGCGG TGGTATTTCC 8280 

ATGAATATCT GAAAGAAGAG CTATTTTCGT CATATCCATC TCCTCGTTTT TTCTCTTGCA 8340

ATAAGTATAA CATAAAAAGT CACAGCTAGA GAAATCTAGC TTTTTTTGAT ATACTAGATA 8400 

AAGATATTAG ACAAGAGGAA ACGAATGACC CCAAACAAAG AAGACTATCT AAAATGTATT 8460 

TATGAAATTG GCATAGACCT GCATAAGATT ACCAACAAGG AAATTGCGGC TCGCATGCAA 8520 

GTCTCTCCCC CTGCCGTAAC TGAAATGATC AAACGAATGA AAAGTGAAAA TCTCATCCTA 8580 

AAGGACAAGG AATGTGGCTA TCTACTGACT GACCTCGGTG TCAAACTGGT CTCTGAGCTC 8640 

TATCGTAAGC ACCGCTTGAT TGAAGTTTTT CTAGTTCATC ATTTAGACTA TACAAGTGAC 8700

CAGATTCACG AGGAAGCTGA GGTCTTGGAA CACACTGTCT CTGACCTGTT CGTGGAAAGA 8760

CTAGATAAAC TGCTAGGTTT CCCTAAAACC TGCCCCCACG GGGGAACTAT TCCTGCCAAG 8820 

GGAGAACTAC TCGTTGAAAT CAATAACCTC CCACTAGCTG ATATCAAGGA AGCTGGCGCC 8880 

TACCGCCTGA CTCGGGTGCA CGATAGTTTT GACATTCTCC ATTATCTGGA CAAGCACTCA 8940 

CTTCACATCG GTGACCAGCT CCAAGTCAAG CAGTTTGATG GCTTCAGCAA TACCTTCACT 9000 

ATCCTCAGTA ACGACGAGGA TTTACAAGTG AATATGGAGA TTGCAAAACA ACTCTATGTC 9060 

GAGAAAATCA ACTAATTTCT CAAGTCCCCT ACCAACCCTG AAAGTTTTAT TTTGGCTCTT 9120 

TGTCAACTGT AGTGGGTTGA AGTCAGCTAA GCTCGAGAAA GGACAAATTT TGTCCTTTCT 9180 

TTTTTGATAT TCAGAGCGAT AAAAATCCGT TTTTTGAAGT TTTCAAAGTT CCGAAAACCA 9240 

AAGGCATTGC GCTTGATAAG TTTGATGAGA TTATTGGTCG CTTCCAGTTT GGCATTAGAA 9300 

TAGTGTAGTT GAAGGGCGTT GACAATCTTT TCTTTATCTT TGAGGAAGGT TTTAAAGACA 9360 

GTCTGAAAAA TAGGATGAAC CTGCTTTAGA TTGTCCTCAA TGAGTCCGAA AAATTTCTCC 9420 

GGTTTCTTAT TCTGAAAGTG AAACAGCAAG AGTTGATAGA GCTGATAGTG GTGTTTCAAG 9480 

TCTTGTGAAT AGCTCAAAAG CTTGTCTAAA ATCTCTTTAT TGGTTAAGTG CATACGAAAA 9540 

GTAGGACGAT AAAATCGCTT ATCACTCAGT TTACGGCTAT CCTGTTGTAT GAGCTTCCAG 9600

TAGCGCTTGA TAGCCTTGTA TTCATGGGAT TTTCGATCCA ATTGGTTCAT AATTTGAACA 9660 

CGCACACGAC TCATAGCACG GCTAAGATGT TGTACAATGT GAAAGCGATC CAACACGATT 9720 

TTAGCATTCG GGAGTGAAAC AGTCTGGGAG ACTGTTTCAG CCTGAGGCTA GAAATTTGAA 9780 

AGCGAAGCTG TTTAGCCAAG TCATAGTAAG GACTAAACAT ATCCATCGTA ATGATTTTGA 9840 

CTTGACAACG AACGGCTCTA TCGTAGCGAA GAAAGTGATT TCGGATGACA GCTTGTGTTC 9900 

TGCCTTCAAG AACAGTGATA ATATTAAGAT TATCAAAATC TTGCGCAATG AAACTCATCT 9960 

TTGCCTTAGT GAAGGCATAG TCATCCCAAG ACATAATCTT TGGAAGGGGA GAAAAATCAT 10020 

GCTCAAAGTG AAAGTCATTG AGCTTGCGAA TGACAGTTGA AGTTGAAATG GCCAGCTGAT 10080 

GGGCAATATC AGTCATAGAA ATTTTTTCAA TTAACTTTTG AGCAATyTTT TGGTTGATGA 10140 

TACGAGGGAT TTGGTGATTT TTCTTTACCA GGGGAGTCTC AGCAAGGATC ATTTTTGAAC 10200 

AGTGATAGCA CTTGAAACGA CGCTTTCTAA GGAGAATTCT AGAAGGCATA CCAGTCGTTT 10260 

CAAGATAAGG AATTTTAGAA GGTTTTTGAA AGTCATATTT CTTCAATTGG TTTCCGCAGT 10320 

CAGGGCAAGA TGGGGCGTCG TAGTCCAGTT TGGCGATGAT TTCCTTGTGT GTATCCTTAT 10380 

TGATGATGTC TAAAATCTGG ATATTAGGGT CTTTAATGTG TAGTAATTTT GTGATAAAAT 10440

GTAATTGTTC CATATGATTC TTTCTAATGA GTTGTTTTGT CGCTTTTCAT TATAGGTCAT 10500

ATGGGACTTT TTTTCTACAA TAAAATAGGC TCCATAATAT CTATAGTGGA TTTACCCACT 10560 

ACAAATATTA TAGAACCGTA AAAATAGAAG GAGATAGCAG GTTTTCAAGC CTGCTATCTT 10620 

TTTTTGATGA CATTCAGGCT GATACGAAAT CATAAGAGGT CTGAAACTAC TTTCAGAGTA 10680

GTCTGTTCTA TAAAATATAG TAGATTGAAA TAAGATGTGA ACAACTCTAT CAGGAAAGTC 10740 

AAATTAATTT ATAGAATTAT TTTAGCAGTC AAGGTGTACT GTTATAGATT CAATATATTA 10800 

TATGACTATT AACCTTGTCT TCTCCTAAAA TTGACTTTCT TGTTTTCTTA TCTTGTCCAC 10860 

TCGAAACAAG TATTGTAAGA ATTTGATTAT TTTTGAAAGT ACTTTTAATA TACTTGATAT 10920 

AGTTAAAAAA GATTTGAAAC TAAATTCCAA ATTAGAAAAA GACTTGAAAT ACTAAAAAAA 10980 

AAAAAGTATA CTCTAATTGA AAACGGTAAC AAAACTAATT TAGAGAATGA AATATAGAGT 11040 

ATTTCTCTCT TAAAAGTTTT TGGTGAAACG AGATGTAGAA AGGAGATTTA GCCAAAGAGT 11100 

CTATTAGTGC TAGAATAATA GATTAGAATT ATTTTAGAAA AACGAAGTGA GCAGCTTATA 11160 

AATTCAAGTC CCCAAATAGA TTCATACTAG TATCTTTTGC AAAAAATAAA GGGCGACTTC 11220 

CTTCATGAAT ATCAATTTCA TCTATAAGGA AGGTAGCTAA TTGAACTAAC TTATTTATTC 11280

TGTTTGTCGC TAGAAAAATC AGACCTCCTT GTGAAGATTG AGGAGATACT TAATGAAAAT 11340 

CAAAGAAGAA ACTAGCAAGC TAGTAGCAGA TTGCCCAAAA CACCGCTTTG AGGTTGTAGA 11400 

TAAGACTGAC CTATATAATC CAAGGTGAAG CGACTGTGGT TTGAAGAGAT TTTCAAAGAG 11460 

TATAGGCTAG AGAGTAGTGT TTTTATGTCC TTCTAGTAGA AAATGCTAGA CAGAAGAATG 11520 

GGGAACTTGG ATAGGAAAAA TAGATTGAGA AAGGAGGTTA GAAGAGATGA TTATTACAAA 11580 

AATTAGCCGT TTAGGAACTT ATGTGGGAGT AAATCCACAT TTTGCAACAT TAATAGATTT 11640 

TCTAGAAAAA ACAGGACTAG AAAATTTAAC AGAAGGTTCG ATTGCTATCG ATGGTAATCG 11700 

ATTGTTTGGG AATTGCTTTA CTTATCTAGC AGATGGTCAA GCAGGGGGTT TCTTTGAAAC 11760 

CCACCAAAAA TATTTGGATA TTCATTTAGT TTTGGAAAAC GAAGAAGCCA TGGCTGTTAG 11820 

ATCGCCGGAA AATGTAAGCG TTACCCAAGA ATATGATGAA GAGAAAGATA TTGAATTATA 11880 

CACAGGGAAA GTGGAACAGT TGGTTCATTT GAGAGCTGGC GAATGCCTCA TCACTTTTCC 11940 

AGAAGATTTA CATCAACCCA AGGTTCGTAT AAATGATGAA CCTGTGAAAA AAGTTGTCTT 12000 

TAAAGTTGCG ATTTCTTAAT GTAGAAAGAG AAGAACGATG AAAAAAATGA GAAAGTTTTT 12060 

ATGTCTAGCT GGAATTGCGC TAGCGGCTGT TGCCTTGGTA GCTTGTTCAG GAAAAAAAGA 12120 

AGCTACAACT AGTACTGAAC CACCAACAGA ATTATCTGGT GAGATTACAA TGTGGCACTC 12180 

CTTTACTCAA GGACCCCGTT TAGAAAGTAT TCAAAAATCA GCAGATGCTT TCATGCAAAA 12240

GCATCCAAAA ACGAAAATCA AGATTGAAAC ATTTTCTTGG AATGACTTCT ATACTAAATG 12300

GACTACAGGT TTAGCAAATG GAAATGTGCC AGATATCAGT ACAGCTCTTC CTAACCAAGT 12360 

AATGGAAATG GTCAACTCAG ATGCTTTGGT TCCGCTAAAT GATTCTATCA AGCGTATTGG 12420 

ACAAGATAAA TTTAACGAAA CTGCCTTAAA TGAAGCAAAA ATCGGAGATG ATTACTACTC 12480 

TGTTCCTCTT TATTCACATG CACAAGTCAT GTGGGTTAGA ACAGATTTGT TAAAAGAACA 12540 

TAATATTGAG GTTCCTAAAA CTTGGGATCA ACTCTATGAA GCTTCTAAAA AATTGAAAGA 12600 

AGCTGGAGTT TATGGCTTGT CTGTTCCGTT TGGAACAAAT GACTTAATGG CAACACGTTT 12660 

CTTGAACTTC TACGTACGTA rTGGTGGAGG AAGCCTCTTA ACAAAAGATC TTAAAGCAGA 12720 

CTTGACAAGC CAACTTGCTC AAGATGGTAT TAAATACTGG GTTAAATTGT ATAAAGAAAT 12780 

CTCACCTCAA GATTCTTTGA ACTTTAATGT CCTTCAACAA GCTACCTTGT TCTATCAAGG 12840 

AAAAACAGCA TTTGACTTTA ACTCTGGCTT CCATATCGGA GGAATTAATG CCAACAGTCC 12900 

TCAATTGATT GATTCGATTG ATGCTTATCC TATTCCAAAA ATCAAAGAGT CTGATAAAGA 12960 

CCAAGGAATT GAAACCTCAA ACATTCCAAT GGTTGTTTGG AAAAATTCAA AACATCCAGA 13020 

AGTTGCTAAA GCATTCTTAG AAGCACTTTA TAATGAAGAA GACTACGTTA AATTCCTTGA 13080 

TTCAACTCCA GTAGGTATGT TGCCAACTAT TAAGGGGATT AGCGATTCTG CAGCCTATAA 13140 

AGAAAATGAA ACTCGTAAGA AATTTAAACA TGCTGAAGAA GTAATTACTG AAGCTGTTAA 13200 

AAAAGGTACT GCTATTGGTT ATGAAAATGG GCCAAGTGTA CAAGCTGGTA TGTTGACTAA 13260 

CCAACACATT ATTGAACAAA TGTTCCAAGA TATCATTACA AATGGAACAG ATCCTATGAA 13320 

AGCAGCAAAA GAAGCAGAAA AACAATTAAA TGATTTATTT GAGGCTGTTC AGTAGATGTA 13380 

AAAGACTAGA AAATAGGTGG GATAGTGAGC TGAAAAGCTC TAGCCCAATC TTGTAAAAGA 13440 

AGGGAGAAGG AGAATGGTTA AAGAACGTAA TTTAACTCGC TGGATATTTG TTTTGCCAGC 13500 

TATGATTATC GTAGGATTAC TCTTTGTTTA TCCGTTTTTC TCGAGTATTT TTTATAGCTT 13560 

TACCAATAAG CATTTGATTA TGCCTAATTA TAAATTTGTT GGTTTGGCTA ACTATAAAGC 13620 

TGTGCTATCA GATCCCAACT TCTTTAATGC GTTCTTTAAT TCAATTAAGT GGACCGTTTT 13680 

CTCATTAGTT GGTCAAGTTT TAGTAGGGTT TGTATTGGCT TTAGCTCTTC ACAGAGTACG 13740 

CCACTTCAAG AAATTATATA GGACATTATT GATTGTTCCT TGGGCATTTC CTACCATCGT 13800 

TATTGCCTTC TCTTGGCAGT GGATTCTAAA CGGGGTTTAT GGCTACTTAC CTAATCTAAT 13860 

CGTAAAATTA GGTTTAATGG AACATACACC TGCATTTTTG ACAGATAGTA CATGGGCATT 13920 

CCTATGTTTG GTGTTTATCA ACATTTGGTT TGGAGCACCA ATGATTATGG TTAATGTGCT 13980

TTCAGCTTTG CAAACAGTAC CAGAAGAACA ATTTGAGGCT GCTAAGATAG ATGGTGCTTC 14040

AAGTTGGCAG GTGTTCAAGT TTATCGTCTT TCCACATATT AAAGTGGTTG TAGGACTTCT 14100 

AGTTGTTTTG AGAACTGTAT GGATCTTTAA TAACTTTGAC ATTATCTACC TCATTACTGG 14160 

TGGTGGACCA GCCAATGCTA CAACGACGCT TCCAATTTTT GCTTACAACC TGGGCTGGGG 14220 

AACTAAATTG TTGGGTCGTG CTTCAGCAGT TACAGTACTG CTCTTTATCT TCTTGGTGGC 14280 

GATTTGCTTT ATCTACTTTG CTATCATCAG TAAGTGGGAA AAGGAGGGTA GAAAATAATG 14340 

AAGAAGAAAT CCAGTATTTA TTTAGATATT CTCTCACATG TACTTTTAGT TGGTGGGACC 14400 

ATCGTTGCAG TTTTCCCATT GGTATGGATT ATCATATCTT CTGTCAAAGG GAAAGGGGAA 14460 

TTAACTCAGT ATCCAACACG ATTTTGGCCT GAACAGTTTA CATTAGATTA TTTCACTCAT 14520 

GTTATCAACG ATTTGCACTT CATTGATAAC ATTCGAAACA GTTTAATCAT TGCCTTGGCT 14580 

ACAACCCTTA TTGCGATTAT TATTTCTGCT ATGGCAGCCT ATGGTATTGT TCGATTCTTT 14640 

CCTAAATTGG GAGCAATCAT GTCGAGACTA CTCGTCATTA CCTACATTTT CCCACCAATT 14700 

TTGTTAGCAA TTCCCTATTC AATTGCCATT GCTAAAGTTG GGTTAACAAA TAGTTTATTT 14760 

GGCTTGATGA TGGTTTATCT ATCTTTTAGT GTTCCATATG CAGTTTGGCT CTTAGTTGGA 14820 

TTTTTCCAAA CAGTTCCAAT TGGAATTGAA GAAGCGGCTA GAATTGATGG TGCAAATAAA 14880 

TTTGTTACGT TTTATAAAGT TGTGCTACCG ATTGTAGCAC CAGGTATTGT AGCAACAGCT 14940 

ATTTATACAT TTATCAATGC TTGGAATGAA TTCCTGTATG CCTTGATTTT GATTAACAAT 15000 

ACAGGAAAGA TGACAGTAGC AGTAGCCCTT CGTTCACTTA ATGGTTCAGA AATACTAGAC 15060 

TGGGGAGATA TGATGGCAGC GTCTGTTATT GTAGTTCTTC CATCAATTAT TTTCTTCTCT 15120 

ATCATCCAAA ATAAGATTGC AAGTGGATTA TCAGAAGGAT CTGTGAAGTA GACGAAAGAA 15180 

GGAAAAAAAT GAATAAAAGA GGTCTTTATT CAAAACTAGG AATTTCCGTT GTAGGCATTA 15240 

GTCTTTTAAT GGGAGTCCCC ACTTTGATTC ATGCGAATGA ATTAAACTAT GGTCAACTGT 15300 

CCATATCTCC TATTTTTCAA GGAGGTTCAT ATCAACTGAA CAATAAGAGT ATAGATATCA 15360 

GCTCTTTGTT ATTAGATAAA TTGTCTGGAG AGAGTCAGAC AGTAGTAATG AAATTTAAAG 15420 

CAGATAAACC AAACTCTCTT CAAGCTTTGT TTGGCCTATC TAATAGTAAA GCAGGCTTTA 15480 

AAAATAATTA CTTTTCAATT TTCATGAGAG ATTCTGGTGA GATAGGTGTA GAAATAAGAG 15540 

ACGCCCAAAA GGGAATAAAT TATTTATTTT CCAGACCAGC TTCATTATGG GGAAAACATA 15600 

AAGGACAGGC AGTTGAAAAT ACACTAGTAT TTGTATCTGA TTCTAAAGAT AAAACATACA 15660 

CAATGTATGT TAATGGAATA GAAGTGTTCT CTGAAACAGT TGATACATTT TTGCCAATTT 15720 

CAAATATAAA TGGTATAGAT AAGGCAACAC TAGGAGCTGT TAATCGTGAA GGTAAGGAAC 15780

ATTACCTCGC AAAAGGAAGT ATTGATGAAA TCAGTCTATT TAACAAAGCA ATTAGTGATG 15840

AGGAAGTTTC AACTATTCCC TTGTCAAATC CATTTCAGTT AATTTTCCAA TCAGGAGATT 15900 

CTACTCAAGC TAACTATTTT AGAATACCGA CACTATATAC ATTAAGTAGT GGAAGAGTTC 15960 

TATCAAGTAT TGATGCACGT TATGGTGGGA CTCATGATTC TAAAAGTAAG ATTAATATTG 16020 

CCACTTCTTA TAGTGATGAT AATGGGAAAA CGTGGAGTGA GCCAATTTTT GCTATGAAGT 16080 

TTAATGACTA TGAGGAGCAG TTAGTTTACT GGCCACGAGA TAATAAATTA AAGAATAGTC 16140 

AAATTAGTGG AAGTGCTTCA TTCATAGATT CATCCATTGT TGAAGATAAA AAATCTGGGA 16200 

AAACGATATT ACTAGCTGAT GTTATGCCTG CGGGTATTGG AAATAATAAT GCAAATAAAG 16260 

CCGACTCAGG TTTTAAAGAA ATAAATGGTC ATTATTATTT AAAACTAAAG AAGAATGGAG 16320

ATAACGATTT CCGTTATACA GTTAGAGAAA ATGGTGTCGT TTATAATGAA ACAACTAATA 16380 

AACCTACAAA TTATACTATA AATGATAAGT ATGAACPTTT GGAGGGAGGA AAGTCTTTAA 16440 

CAGTCGAACA ATATTCGGTT GATTTTGATA GTGGCTCTTT AAGAGAAAGG CATAATGGAA 16500 

AACAGGTTCC TATGAATGTT TTCTACAAAG ATTCGTTATT TAAAGTGACT CCTACTAATT 16560 

ATATAGCAAT GACAACTAGT CAGAATAGAG GAGAGAGTTG GGAACAATTT AAGTTGTTGC 16620 

CTCCGTTCTT AGGAGAAAAA CATAATGGAA CTTACTTATG TCCCGGACAA GGTTTAGCAT 16680 

TAAAATCAAG TAACAGATTG ATTTTTGCAA CATATACTAG TGGAGAACTA ACCTATCTCA 16740 

TTTCTGATGA TAGTGGTCAA ACATGGAAGA AATCCTCAGC TTCAATTCCG TTTAAAAATG 16800 

CAACAGCAGA AGCACAAATG GTTGAACTGA GAGATGGTGT GATTAGAACA TTCTTTAGAA 16860 

CCACTACAGG TAAGATAGCT TATATGACTA GTAGAGATTC TGGAGAAACA TGGTCGAAAG 16920

TTTCGTATAT TGATGGAATC CAACAAACTT CATATGGCAC ACAAGTATCT GCAATTAAAT 16980 

ACTCTCAATT AATTGATGGA AAAGAAGCAG TCATTTTGAG TACACCAAAT TCTAGAAGTG 17040 

GCCGCAAGGG AGGCCAATTA GTTGTGGGTT TAGTCAATAA AGAAGATGAT AGTATTGATT 17100 

GGAAATACCA CTATGATATT GATTTGCCTT CGTATGGTTA TGCCTATTCT GCGATTAGAG 17160 

AATTGCCAAA TCATCACATA GGTGTACTGT TTGAAAAATA TGATTCGTGG TCGAGAAATG 17220 

AATTGCATTT AAGCAATGTA GTTCAGTATA TAGATTTGGA AATTAATGAT TTAACAAAAT 17280 

AAAGGAGAAA AACATGGTTA AATACGGTGT TGTTGGAACA GGGTATTTTG GAGCTGAATT 17340 

GGCTCGCTAC ATGCAAAAGA ATGATGGAGC AGAGATTACT CTTCTCTATG ATCCAGATAA 17400 

TGCAGAGGCG ATTGCAGAAG AATTGGGAGC AAAAGTAGCA AGTTCCTTAG ATGAGTTGGT 17460

TTCTAGCGAT GAAGTAGATT GTGTTATCGT CGCAACTCCA AATAATGTTC ATAAGGAACC 17520

GGTTATTAAG GCTGCACAGC ATGGTAAAAA TGTTTTCTGT GAAAAACCAA TTGCGCTTTC 17580

TTATCAAGAT TGTCGCGAGA TGGTAGATGC GTGTAAAGAA AACAATGTAA CCTTTATGGC 17640 

AGGACATATT ATGAATTTCT TTAATGGTGT TCATCATGCA AAAGAACTCA TTAATCAAGG 17700 

AGTTATCGGA GACGTTCTAT ATTGTCATAC AGCTCGTAAT GGTTGGGAAG AACAACAACC 17760 

GTCAGTATCA TGGAAAAAAA TTCGTGAAAA ATCAGGTGGT CACTTGTATC ACCACATCCA 17820 

TGAATTGGAT TGCGTTCAAT TCCTTATGGG GGGCATGCCT GAAACTGTAA CCATGACAGG 17880 

TGGAAATGTG GCCCATGAAG GTGAACATTT CGGTGATGAA GATGATATGA TTTTTGTCAA 17940 

TATGGAATTT TCTAATAAGC GTTTTGCCTT GTTAGAATGG GGTTCAGCTT ATCGTTGGGG 18000 

TGAACATTAT GTCTTAATCC AAGGAAGCAA AGGTGCCATC CGCTTAGACT TATTCAACTG 18060 

TAAAGGAACT CTTAAGCTAG ATGGGCAAGA AAGCTATTTC TTGATTCACG AATCGCAAGA 18120 

AGAAGATGAT GATCGGACTC GTATCTATCA TAGTACAGAG ATGGATGGAG CAATTGCTTA 18180 

TGGTAAACCA GGTAAACGTA CTCCATTATG GCTATCATCT GTCATTGATA AAGAAATGCG 18240 

CTATCTGCAT GAGATTATGG AAGGAGCTCC AGTATCAGAA GAATTTGCAA AACTTTTGAC 18300 

AGGTGAAGCT GCCCTAGAAG CAATTGCTAC TGCAGATGCT TGTACCCAGT CTATGTTTGA 18360 

AGATCGCAAA GTAAAATTGT CAGAAATTGT AAAATAAATT TTGGTATTCT CCTATTTATA 18420 

GGTCGACTTG CTCCTCTGAA AGTACTTTTA GAGGAGCTGT TTGACTTTGC TAGTTTTTGA 18480 

AACTGAAATC TATTATACTA CAAACTATTG AAAGCGTTTT AATTTTAAGG TATAATAATC 18540 

TCATAGAAAT AAAGAAAAGG AGGAAAGAGG ATGCCACAGA TTAGCAAAGA AGCCTTGATT 18600 

GAGCAAATCA AAGATGGAAT CATCGTTTCT TGTCAGGCTC TTCCTCATGA ACCGCTTTAT 18660 

ACAGAAGCGG GAGGGGTGAT TCCCTTGCTG GTCAAAGCGG CTGAGCAAGG TGGAGCAGTC 18720 

GGTATCCGAG CAAACAGTGT TCGCGATATC AAGGAAATTA AGGAAGTCAC TAAACTTCCA 18780 

ATCATTGGGA TTATCAAACG TGATTATCCA CCTCAGGAAC CCTTCATCAC GGCTACTATG 18840 

AAAGAAGTTG ATGAATTGGC AGAACTGGAC ATCGAGGTGA TTGCTCTGGA TTGTACCAAG 18900 

CGTGAACGCT ACGATGGTTT GGAAATTCAA GAGTTCATTC GTCAGGTTAA GGAGAAATAT 18960 

CCTAATCAGC TTTTGATGGC TGATACTAGT ATCTTCGAAG AAGGGCTAGC AGCTGTAGAA 19020 

GCAGGAATTG ACTTTGTCGG AACAACCTTA TCAGGCTACA CATCCTACAG TCCAAAAGTA 19080

GACGGTCCAG ATTTTGAATT GATTAAGAAA CTCTGTGATG CTGGTGTAGA TGTCATTGCA 19140 

GAAGGAAAAA TTCATACACC AGAACAAGCC AAACAAATCC TTGAATATGG AGTGCGAGGC 19200 

ATCGTTGTTG GTGGCGCCAT TACTAGACCA AAAGAGATTA CAGAACGCTT CGTTGCTAGT 19260 

CTTAAATAAG ATGTGAGGGG GAGTTTTATG TTTAAAGTTT TACAAAAAGT TGGAAAAGCT 19320 



TTTATGTTAC CTATAGCTAT ACTTCCTGCA GCAGGTCTAC TTTTGGGGAT TGGTGGTGCA 19380 

CTTTCAAACC CAACCACGAT AGCAACTTAT CCAATACTAG ACAATAGTAT TTTTCAATCA 19440 

ATATTCCAAG TAATGAGCTC TGCAGGAGAG GTTGTATTCA GTAATTTGTC ACTACTTCTC 19500 

TGTGTGGGAT TATGTATTGG CTTAGCGAAA CGAGATAAAG GAACCGCTGC GTTAGCAGGA 19560 
GTAACTGGTT ACTTAGTTAT GACTGCAACG ATCAAAGCTT TGGTAAAACT TTTTATGGCA 19620

GAAGGATCTG CAATTGATAC TGGAGTTATT GGAGCATTAG TTGTCGGAAT AGTTGCCGTA 19680 

TATTTGCACA ACCGATATAA CAATATTCAA TTACCTTCCG CTTTAGGATT CTTTGGAGGT 19740 

TCACGCTTCG TTCCTATTGT TACATCGTTG TCTTCTATCT TGATTGGCTT TGTCTTCTTT 19800 

GTTATTTGGC CACCTTTCCA ACAACTTCTT GTTTCTACAG GTGGATATAT TTCTCAGGCG 19860 

GGTCCAATTG GAACTTTTCT ATATGGATTT TTAATGAGAC TTTCTGGAGC AGTAGGCTTA 19920 

CATCATATAA TTTACCCTAT GTTTTGGTAT ACTGAACTTG GTGGTGTTGA AACTGTTGCA 19980 

GGACAAACAG TGGTTGGAGC TCAAAAAATA TTTTTTGCTC AATTAGCCGA TTTGGCCCAT 20040 

TCTGGATTAT TTACAGAAGG AACAAGGTTT TTTGCAGGTC GTTTCTCAAC AATGATGTTC 20100 

GGTTTACCGG CTGCCTGTTT AGCGATGTAC CATAGTGTTC CTAAAAATCG TCGTAAAAAA 20160 

TACGCGGGTT TGTTTTTTGG AGTTGCTTTA ACATCTTTTA TTACCGGTAT TACAGAACCA 20220 

ATTGAATTTA TGTTTCTATT CGTCAGTCCG GTTCTATATG TTGTTCACGC ATTCCTTGAT 20280 

GGTGTTAGCT TCTTTATTGC AGACGTCTTA AATATTTCAA TAGGAAACAC ATTTTCAGGA 20340 

GGTGTAATCG ATTTCACTTT ATTTGGAATT TTGCAGGGGA ACGCTAAGAC GAATTGGGTT 20400 

CTTCAGATTC CATTTGGACT TATTTGGAGT GTTTTGTATT ATATTATTTT TAGATGGTTC 20460 

ATTACTCAAT TCAACGTTCT AACGCCAGGG CGAGGAGAAG AAGTAGATTC TAAAGAAATT 20520 

TCTGAATCCG CAGATTCAAC TTCAAATACT GCAGATTATT TAAAACAGGA TAGCCTACAA 20580 

ATTATCAGAG CCTTGGGTGG ATCAAATAAT ATAGAAGATG TAGATGCTTG TGTGACACGT 20640 

TTACGTGTAG CTGTAAAAGA AGTTAATCAA GTTGATAAAG CACTTTTAAA ACAAATTGGT 20700 

GCAGTTGATG TCTTAGAAGT GAAGGGTGGC ATTCAAGCAA TCTATGGAGC AAAAGCAATC 20760 

TTATATAAAA ATAGTATTAA TGAAATTTTA GGTGTAGATG ATTAAGTACT TACTGACTTA 20820 

ATAAAAAACA GAGGAGAGTG ATGGATGAGT AGGATGAAAT GAAATCGCAT ACAAGAAATA 20880 

AAGAACTCAT TATCCAAGTT GGATACGCTT ATTACATAGG AGAATACAAA TGAAATTTAG 20940 

AAAATTAGCT TGTACAGTAC TTGCGGGTGC TGCGGTTCTT GGTCTTGCTG CTTGTGGCAA 21000 

TTCTGGCGGA AGTAAAGATG CTGCCAAATC AGGTGGTGAC GGTGCCAAAA CAGAAATCAC 21060 



TTGGTGGGCA TTCCCAGTAT TTACCCAAGA AAAAACTGGT GACGGTGTTG GAACTTATGA 21120

AAAATCAATC ATCGAAGCGT TTGAAAAAGC AAACCCAGAT ATAAAAGTGA AATTGGAAAC 21180

CATCGACTTC AAGTCAGGTC CTGAAAAAAT CACAACAGCC ATCGAAGCAG GAACAGCTCC 21240

AGACGTACTC TTTGATGCAC CAGGACGTAT CATCCAATAC GGTAAAAACG GTAAATTGGC 21300

TGAGTTGAAT GACCTCTTCA CAGATGAATT TGTTAAAGAT GTCAACAATG AAAACATCGT 21360

ACAAGCAAGT AAAGCTGGAG ACAAGGCTTA TATGTATCCG ATTAGTTCTG CCCCATTCTA 21420

CATGGCAATG AACAAGAAAA TGTTAGAAGA TGCTGGAGTA GCAAACCTTG TAAAAGAAGG 21480

TTGGACAACT GATGATTTTG AAAAAGTATT GAAAGCACTT AAAGACAAGG GTTACACACC 21540

AGGTTCATTG TTCAGTTCTG GTCAAGGGGG AGACCAAGGA ACACGTGCCT TTATCTCTAA 21600

CCTTTATAGC GGTTCTGTAA CAGATGAAAA AGTTAGCAAA TATACAACTG ATGATCCTAA 21660

ATTCGTCAAA GGTCTTGAAA AAGCAACTAG CTGGATTAAA GACAATTTGA TCAATAATGG 21720

TTCACAATTT GACGGTGGGG CAGATATCCA AAACTTTGCC AACGGTCAAA CATCTTACAC 21780

AATCCTTTGG GCACCAGCTC AAAATGGTAT CCAAGCTAAA CTTTTAGAAG CAAGTAAGGT 21840

AGAAGTGGTA GAAGTACCAT TCCCATCAGA CGAAGGTAAG CCAGCTCTTG AGTACCTTGT 21900

AAACGGGTTT GCAGTATTCA ACAATAAAGA CGACAAGAAA GTCGCTGCAT CTAAGAAATT 21960

CATCCAGTTT ATCGCAGATG ACAAGGAGTG GGGACCTAAA GACGTAGTTC GTACAGGTGC 22020

TTTCCCAGTC CGTACTTCAT TTGGAAAACT TTATGAAGAC AAACGCATGG AAACAATCAG 22080

CGGCTGGACT CAATACTACT CACCATACTA CAACACTATT GATGGATTTG CTGAAATGAG 22140

AACACTTTGG TTCCCAATGT TGCAATCTGT ATCAAATGGT GACGAAAAAC CAGCAGATGC 22200

TTTGAAAGCC TTCACTGAAA AAGCGAACGA AACAATCAAA AAAGCTATGA AACAATAGTC 22260

CTTAGTTATT CTATAAAAAG TAGTTTTTTA AAGAACCTAA GAGTGTATAC CCCCTTTTCC 22320

CTCTACACAG ATAGTGTAAG AAAAGGGGGC TTTTGTTTAA AATGTAAGAA ACTGTCACGA 22380

AATTAAAATG AAGTTCTTAC ATAAGCGAAT CATAAAAAAT TTCATTTTGA TTTTAAAACA 22440

GTTCAAGAAA GTCAAAAAAT TATTCTATTT GAAAGAGAGG TGCCGACTGT GAAAGTCAAT 22500

AAAATCCGTA TGCGGGAAAC AGTGATTTCC TACGCTTTCC TAGCACCAGT ATTATTCTTC 22560

TTTGTCATCT TTGTGTTGGC TCCGATGGTG ATGGGCTTCA TTACAAGTTT CTTTAACTAC 22620

TCAATGACTA AATTTGAGTT TGTAGGCTTG GATAACTATA TCCGTATGTT TAAAGATCCT 22680

GTCTTTACAA AATCTCTGAT TAACACAGTT ATTTTGGTTA TTGGATCTGT ACCAGTTGTT 22740

GTTCTATTCT CACTCTTTGT AGCATCTCAG ACCTATCATC AAAATGTCAT TGCCAGATCC 22800

TTCTACCGTT TCGTCTTCTT CCTTCCTGTT GTAACGGGTA GTGTTGCCGT GACAGTTGTT 22860



TGGAAATGGA TTTATGACCC ACTATCAGGG ATTCTAAACT TTGTCCTTAA GTCCAGCCAC 22920 

ATCATCAGCC AAAACATTTC TTGGTTGGGA GATAAAAACT GGGCATTGAT GGCGATTATG 22980

ATTATTCTCT TGACCACTTC AGTTGGTCAG CCCATCATCC TTTATATCGC TGCCATGGGG 23040 

AATATTGACA ATTCACTGGT TGAAGCGGCG CGTGTTGATG GTGCAACTGA GTTTCAAGTT 23100 

TTTTGGAAGA TTAAATGGCC AAGCCTTCTT CCAACAACTC TTTATATTGC AATCATCACA 23160 

ACAATTAACT CATTCCAGTG TTTCGCCTTG ATTCAGCTTT TGACATCTGG TGGTGCAAAC 23220

TACTCAACAA GTACCTTGAT GTACTACCTT TACGAAAAAG CCTTCCAATT GACAGAATAC 23280 

GGCTATGGCA ACACAATTGG TGTCTTCTTG GCAGTCATGA TTGCTATCGT AAGCTTTGTT 23340 

CAATTTAAAG TACTTGGAAA CGACGTAGAA TACTAAAGAA AGGAGACAGG TATGCAATCT 23400 

ACAGAAAAAA AACCATTAAC AGCCTTTACT GTTATTTCAA CAATCATTTT GCTCTTGTTG 23460 

ACTGTGCTGT TCATCTTTCC ATTCTACTGG ATTTTGACAG GGGCATTCAA ATCACAACCT 23520 

GATACAATTG TTATTGCTCG TCAGTGGTTC CCTAAAATGC CAACCATGGA AAAGTTGGAA 23580 

CAACTCATGG TGCAGAACCC TGCCTTGCAA TGGATGTGGA ACTCAGTATT TATCTGATTG 23640 

GTAACCATGT TCTTAGTTTG TGCAACCTCA TCTCTAGCAG GTTATGTATT GGCTAAAAAA 23700 

CGTTTCTATG GTCAACGCAT TCTATTTGCT ATCTTTATCG CTGGTATGGC GCTTCCAAAA 23760 

CAAGTTGTCC TTGTACCATT GGTAGGTATC GTCAAGTTCA TGGGAATCCA TGATACTCTG 23820 

TGGGCAGTTA TCTTGGGTTT GATTGGATGG CCATTGGGTG TCTTCCTCAT GAAACAGTTG 23880 

AGTGAAAATA TCCCTACAGA GTTGGTTGAA TCAGCTAAAA TCGACGGTTG TGGTGAGATT 23940 

CGTACCTTCT GGAGTGTAGC CTTCGCGATT GTGAAACCAG GGTTTGCAGC CCTTGCAATG 24000 

TTTACCTTCA TCAATACTTG GAATGACTAC TTCATGCAAT TGGTAATGTT GACTTCACGT 24060 

AACAATTTGA CCATCTCACT TGGGGTTGCG ACCATGCAGG CTGAAATGGC AACCAACTAT 24120 

GGTTTGATTA TGGCAGGAGC TGCCCTTGCT GCTGTTCCAA TCGTCACAGT CTTCCTAGTC 24180 

TTCCAAAAAT CGTTGACAGA GGGTATTACT ATGGGAGGGG TGAAAGGATA ATAGTCTGCG 24240 

AAAATCTCTT CAAACTACGT CAGCTTCACC TTGCCATACT TAAGTATTGC CTGCGGTTAG 24300 

CTTCCTAGTT TGTTCTTCAA TTTTCATTGA GTATAGGAAA ATCAATGTAT CAAGATAGAG 24360 

AAGTATATTT TATAGATTTA GAGAATATAG AGGTTATAAG TGTCTACAAA ATGGAGGGTA 24420 

TGCAGTTACT TTATGAAGTT TTGTCAGACA CTTATAAACT TAAGAATGGT TTTAGTTAAC 24480 

TATCAGAAAC GAAGGAAAGA GTATGATTTT TGACGATTTG AAAAACATGA CCTTTTACAA 24540 

AGGGATTCAT CCTAATTTAG ACAAGGCTAT GGACTATCTC TACCAACATC GTAAGGATTC 24600

TTTCGAATTA GGAAAGTATG ATATTGATGG AGATAAAGTC TTTCTAGTTG TTCAGGAAAA 24660

TGTCCTCAAT CAAGCTGAAA ATGATCAATT TGAGTATCAT AAGAACTATG CAGATTTGCA 24720 

TTTGCTGGTA GAAGGACATG AATATTCGAG CTACGGTTCA CGTATCAAAG ACGAGGCAGT 24780

AGCATTCGAC GAAGCGAGTG ACATTGGCTT TGTTCATTGT CATGAACACT ACCCACTCTT 24840 

GTTGGGTTAT CACAATTTTG CGATTTTCTT CCCAGGTGAG CCACATCAGC CAAATGGTTA 24900 

TGCAGGCATG GAAGAAAAGG TTCGAAAATA TCTCTTTAAA ATTTTGATTG ATTAAAAATA 24960 

GGATGAATTG TTTTTTTGTA AAGCTTTGAT AATACTCTAC CATGAAATTG ATCTTTGTGA 25020 

GGTAGAGAAA TGAGAATAAA ATATTTAAAA ATTGGTATCT TCTAAGTATG CTGGAAGAGC 25080 

TAGTTTCTTA GATGGACAGG GGATTACAGT TGATGAGATG GCTTGGATAA TTAGGGGCAT 25140 

TGTGAATGCA TTGATTGGTA GATACATAAA ATTAGGTACT TATGCGGCTA AGTATGGTAT 25200 

TAGTATGGCA CGCTCGATCT TAAGTAGGGT AGCTGCAACT GCAGCAGCAA GAGTAGGATT 25260 

ACTGACCAAG ATTTCTGGAT GGATTTTACG AGTAGCTGTG AATGTAGCTG ATGTATATGG 25320 

TAATTTTGCC AACAATATTG CTGCAGCTTG GGATGCATAT GATAAAATTC CTAACAATGG 25380 

TCGTATAAAC TTTTAAAATG CGAGAATGAA AGCACTTTGT ATTTTTTTAT TGAATATGTT 25440 

AGCTTGGACA GTGCTTGCAA TGATAATTCG TGGAGGGCTA GATGGATTTG ATAGGCATAC 25500 

TTGGAGTACT ATTTTAATTG CGTCGCTGTT CGGGGTATAT GATTATAAGC CCATAGATAA 25560 

AAATAGAAAA AAGTCCAAAA GAAAAAATAG ATTTGTTCAT GGTAGGGACT TATGAAAGCT 25620 

TTACTGACAA AAAAGAAAAC AGTTTACAAA GAAAAATGAT GGAGGAGCAA ACATGGCACA 25680 

AAAAGGAGTA AGCCTTATCA AGGCAGCATT TGATACAGAT AACTTTCTCA TGCGTTTTAG 25740 

TGAGAAGGTC TTGGACATCG TGACAGCCAA TCTTCTTTTT GTCGTCTCTT GTTTACCCAT 25800 

CGTGACGATT GGAGTGGCTA AAATCAGCCT CTACGAGACC ATGTTCGAAG TTAAGAAGAG 25860 

CAGACGGGTG CCTGTTTTTA AAATCTATCT AAGATCTTTC AAGCAAAATC TGAAACTAGG 25920 

TCTTCAGCTG GGTTTAATGG AGTTAGGAAT TGTGTTTCTT ACCCTTTCAG ATCTCTATCT 25980 

TTTCTGGGGT CAAACAGCTC TGCCCTTCCA ATTGCTGAAA GCCATTTGTT TAGGTATTCT 26040 

GATTTTTCTT ACTATCGTGA TGCTGGCTAG TTACCCTATC GCGGCACGTT ATGACCTATC 26100 

TTGGAAAGAA ATTCTTCAAA AAGGATTGAT GTTGGCTAGT TTTAACTTTC CTTGGTTCTT 26160 

CCTCATGTTA GCCATTCTTG TCCTCATTGT GATGGTTCTT TATCTGTCCG CCTTCAGTCT 26220 

ACTCTTAGGT GGCTCAGTCT TCCTACTTTT TGGGTTTGGA CTATTGGTCT TTATCCAGAC 26280 

TGGATTGATG GAGAAAATTT TCGCAAAATA CCAATAGGAG CTTTATTTCT GAAACTACTT 26340 

TCAAAGGCTC CAAACGCTAT TCTATAAGCG AGAAACTAAA ATCGG 26385
(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2716 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 



CCTGCCCGCA TTGCCCTAGG CATTAAGTAA ACATATAAAA GCATGTGAGA GACTGTTGGA 60

AAAGCGAGGA AATTTCCCCT CTTTTCCTCT AGTCTCTCCT TTCTTTTGCT GATTTTATTC 120

AAAGAAAATG ATATAATAGT AGTTATGGAG AAAAAGAAAT TACGCATCAA TATGTTGAGT 180

TCAAGTGAGA AAGTAGCAGG ACAGGGAGTT TCAGGTGCTT ACCGTGAATT AGTTCGTCTT 240

CTTCACCGTG CTGCCAAGGA CCAATTGATT GTTACAGAAA ATCTTCCAAT CGAGGCAGAT 300

GTGACTCACT TTCATACGAT TGATTTTCCC TATTATTTAT CAACCTTCCA AAAGAAACGC 360

TCAGGGAGAA AGATTGGCTA TGTGCATTTC TTGCCAGCTA CACTTGAGGG AAGTTTGAAA 420

ATTCCATTTT TCTTAAAGGG AATTGTGAAA CGCTATGTAT TTTCTTTTTA CAACCGGATG 480

GAGCACTTGG TTGTGGTCAA TCCTATGTTT ATTGAGGATT TGGTAGCAGC TGGTATTCCA 540

CGTGAAAAAG TGACCTATAT TCCTAACTTT GTCAACAAGG AAAAATGGCA TCCTCTACCA 600

CAAGAAGAGG TAGTCAGACT GCGCACAGAT CTTGGTCTTA GTGACAATCA GTTTATCGTA 660

GTAGGTGCTG GGCAAGTTCA GAAACGTAAA GGGATTGATG ACTTTATCCG TCTGGCTGAG 720

GAATTGCCTC AGATTACCTT TATCTGGGCT GGTGGCTTCT CTTTTGGTGG TATGACAGAT 780

GGTTATGAAC ACTATAAGAA AATTATGGAA AATCCCCCTA AAAATTTGAT TTTTCCAGGC 840

ATTGTATCGC CAGAGCGGAT GCGCGAATTG TATGCTCTAG CGGATCTTTT CTTGTTGCCT 900

AGTTACAATG AGCTCTTTCC TATGACTATT TTAGAAGCTG CGAGTTGTGA GGCTCCTATT 960

ATGTTGCGTG ATTTAGATCT CTATAAGGTG ATTTTGGAGG GAAATTATCG GGCGACAGCG 1020

GGTAGAGAAG AGATGAAAGA GGCTATTTTG GAATATCAAG CAAATCCTGC TGTCTTAAAA 1080

GATCTCAAAG AAAAGGCTAA GAATATTTCC AGAGAGTATT CTGAAGAGCA TCTGTTACAA 1140

ATCTGGTTGG ACTTTTATGA GAAACAAGCC GCTTTAGGGA GAAAGTAAAA AGTGAGGTAA 1200

TCTATGCGAA TTGGTTTATT TACAGATACC TATTTTCCTC AGGTTTCTGG TGTTGCGACC 1260

AGTATTCGAA CCTTGAAAAC AGAACTTGAA AAGCAGGGAC ATGCTGTTTT TATCTTTACG 1320

ACGACAGATA AGGATGTCAA TCGCTACGAA GATTGGCAAA TTATCCGCAT TCCAAGTGTT 1380

CCTTTCTTTG CTTTTAAGGA TCGTCGCTTT GCCTACCGAG GTTTTAGCAA GGCACTTGAA 1440

ATTGCTAAAC AGTATCAGCT AGATATTATC CATACTCAGA CAGAATTTTC TCTTGGCCTG 1500 

TTGGGGATTT GGATTGCGCG TGAATTGAAA ATTCCAGTCA TCCATACCTA TCACACCCAG 1560 

TATGAAGACT ATGTCCATTA TATTGCTAAG GGGATGTTGA TCCGGCCGAG TATGGTCAAG 1620 

TATCTGGTTA GAGGTTTCCT GCATGATGTG GATGGGGTTA TTTGCCCTAG TGAGATTGTC 1680 

CGTGACTTGC TATCTGATTA TAAGGTCAAG GTTGAAAAAC GGGTCATTCC TACTGGGATT 1740 

GAATTAGCCA AGTTTGAGCG TCCGGAAATC AAGCAGGAAA ATTTGAAAGA ACTGCGTAGT 1800 

AAACTAGGGA TTCAAGATGG TGAAAAGACG TTGCTTAGTC TTTCGAGAAT CTCCTATGAA 1860

AAAAATATTC AAGCAGTTTT AGCAGCCTTT GCTGATGTTC TGAAAGAGGA AGACAAGGTT 1920 

AAACTGGTAG TAGCTGGGGA TGGCCCTTAT CTGAATGACC TCAAAGAGCA AGCCCAGAAC 1980 

CTAGAGATTC AAGACTCAGT CATCTTTACA GGGATGATTG CTCCTAGTGA GACGGCTCTT 2040 

TACTATAAAG CGGCGGATTT CTTCATTTCG GCATCGACAA GCGAAACGCA AGGTTTGACC 2100 

TACTTGGAAA GCTTAGCCAG TGGAACACCT GTCATTGCTC ACGGAAATCC TTATTTGAAC 2160 

AACCTCATCA GTGATAAAAT GTTTGGAACC TTGTACTATG GAGAACATGA TTTGGCTGGT 2220 

GCTATTTTGG AAGCCCTGAT TGCAACACCA GACATGAACG AGCATACCTT ATCAGAGAAA 2280 

TTGTATGAGA TTTCAGCTGA GAACTTTGGG AAACGAGTGC ATGAGTTTTA TCTGGATGCC 2340 

ATTATTTCAA ATAACTTCCA GAAAGATTTG GCTAAAGATG ATACGGTCAG TCAGCGTATC 2400 

TTTAAGACAG TTTTGTATCT TCAGCAACAG GTGGTTGCTG TACCTGTAAA AGGATCTAGA 2460 

CGCATGTTGA AGGCTTCAAA AACACAGTTG ATCAGTATGA GAGACTATTG GAAAGACCAT 2520 

GAAGAATAGA AAGAGGAACA GCTATGAAAA AAACAATTAA TGAGAAGCGG TCGTGATAAA 2580 

AAGATTGCGG GTGTTTGTGC TGGGGTGGCC CATTATCTGG ATATGGATCC GACTATCGTT 2640 

CAAGTCATTT GGGGTGTTCT TACTTGCTGT TACGGAGCTG GAATTGTAGC TTACATTATT 2700 

TTATGGATTA TCGCGA 2716 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13926 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
CTTTGGTTTT GCCTTATTCA AGACATGAGG GCCATCAGGA ATGATCTGAA ACTGCGAATC 60

TGTTAACAGT CTATGGAGAG CTTTCATAGA ACTAAGATTC GGTTTATCTT TGCTGCCACA 120

AATTAGTAAG GTTGGATAAG GGTAAGTTCC TGCTATATCC GTTAAATCAA GTGTCTTCAA 180 

CTCCTCAGAA ACTCCGACCA TAAGAGTCTT GTCTGCTCCC TGTTTTTCAA ATACTCTTTT 240 

GGGAAGTAGT TTAAAAATCA GCAATTGAAG ATAAAATAGG ATATTCCCTG CTAATTTAAG 300 

CGGGCATCCT GACAGAATCA AAGCTCGAAG ATTTGGTAAA TCGTAACTGG AAAGTTCTAG 360 

TGTCAGGGCA GCACCTAAGG ACAATCCAAT CAAAACAAAA GGTTCTGTCT CTTGAGCTAG 420 

GTGCTGATAA ACTCGCTCTT TAGCTTGTTG ATAGTTACTA ACTCCAGAAG GAAATAACTC 480 

GATAGCCTCA GAAGGATAAT CTGTCAGTAG ATTCCGAACT TCTTTCCAAG ACTCTGCTGA 540 

CTGCCCTAAC CCATGCAAAA ATATTAATTT CATCTAGTTC TCCTCAAGGC TTAATTCATA 600 

CAAGCCTCTC ACTGCATTAC AGCCGTAAAT AGCTTCTGCT TGGGTTAAAT CTGCCAAGGT 660 

CAAGACTTTC TCTTCTACCT GTCCTGTTTC TAGCAAATGC TGACGGTAAA TTCCTGGCAA 720 

GATTCCAAGT CGGATAGGCG GTGTGTAGAG TTTTCCAGCG ATTTTCAGAA CCAAATTTCC 780 

TATAGAGGTT TCAAGCAGTT CTCCTGACTT ATTGTGGTAA ATCTTCTCTT GTTCTCCTAG 840 

GCTCAAATGC GGTCGGTGAG TGGTTTTAAA GTAGGTAAAG GATTGATTCA AAGCAGCTTC 900 

CTGAAGACAG ACTTGGGCCT GACAAAAGCT TGTACTGAGA GGGGTTAATA CTTGACGATT 960 

GACTTCTATC TCTCCAGATT TGCTAAGGCT GATTCGCAAG CGGTAATCTC GATTAGCTTC 1020 

ACAATCCTGA CACTCTTCCT CAATCTTGTG TCCCAAGTCT TCTGCATCAA AAGGAAAAGC 1080 

AAAATAACGA CTAGCTTTTC TCAGCCTTTC CAGATGTTGT TCTTCAAACA TCAGTTGTTT 1140 

TTGGCTGATT TTTCCAGTTG TAATTAATTG GAAGCGAGCT TGTTTACGAT AGAGAACTGC 1200 

TGCCTTTTGA TGAACCTCTC GGTATTCAGA TTCCCATGTG CTATCCCAAG TAATCCCTCC 1260 

GCCAACTCCA TAAATGGCTT GACCTTTGTG AAGTTGAATG GTACGAATGG CCACATTAAA 1320 

AATCCGTCGT CCATTTGGAA GCAAGAGACC AATCGTTCCA CAGTAGACTC CACGCGGTTG 1380 

AGGCTCCAAG TCCTTGATAA TCTCCATTGT CGCAATTTTC GGTGCACCCG TTATGGAACC 1440 

ACAAGGAAAG AGTGAGCGGA AGATTTCAAC AAGGTCCACA TCCTCTCGCA ACTGACTCTT 1500 

GATGGTCGAA GTCATCTGCC AAACAGTTGA ATACTGCTCT ACCTGACACA GACGCTCCAC 1560 

GTGCTCGCTC CCAACTTCAG AAATACGGTT CATATCATTG CGCAAGAGGT CCACAATCAT 1620 

CATATTTTCA GAGCGATTTT TGGGATCCTG TTCCAACCAA CTGGCCTGTT CAAGATCTTC 1680 

TTGGTCAGTT ACCCCACGCT GAGTCGTCCC CTTCATTGGT CGTGTTGTCA ACTCGCGATC 1740 

ATTTTGCTCA AAAAAGAGCT CTGGGCTCAT GGAAATCACT GTCATCTCGT CATGTTGCAC 1800 



WO 98/18931 



PCT/US97/19588 



176 

ATAGGCATTG TAGCCCGCCT CCTGCTCTAC CACCATACGA TTGTAGATGG CAAAAGGATT 1860

GGCATTTAAC TTTTGCTTAA GTTGGACGGT GTAGTTGACC TGATAGGTAT CTCCCTGCCG 1920 

TAAATGATGG TGAATTTGGG CAATGGCCTT TTCATAGTCT GCTGCAGACG TTACTTCCTG 1980 

CCAATTTGAG GGCAAATCAA TATCCTCATA AGTCAGAGGA ATAGGGGAAG TTTCTACGAT 2040 

ATCATGAACA GTAAAGTAAA GCAGGTACTC TCCCAGTAGG GGATCCTTGT GAACTGCTAA 2100 

TTTTTCCTCA AAAGCAGGTG CAGCCTCGTA GCTGACATAC CCCACCACAT AATAACCTTG 2160 

CTCTTGGTAG CTTTCCACTT GTGCCAGCAA ATCTGCCACT TCTTCTACAT TTGTCGTTTT 2220 

CAACTCTTTA ATAGGCTGGG TAAAGGTATA TCTCTCCCCC AAAGTCCTAA AATCAATCAC 2280 

TGTTTTTCTA TGCATACCTT AAGTATAGCA TAAAATAAGA AAACGCTCAT CCGCAAAGCA 2340 

GATGAGAGAT TTCAATTATT TAAAGATTGA AGTTTTAAAG CTATTTGTTT GTTGAAGAAG 2400 

TTTCTTATAA ACAGCTTCTT TTAATTTAAC TGTATTATTC ATAGATACTG TTTTATTACC 2460 

GTTTGCTTCT TGTTTAAGAG TTTCGGCATC TTTTTTAACA GCTTCTTTAA ACAATGTCAG 2520 

TAAATCATCG TATGATGAAA CGGAAGAACC ATTTACTTCG AATGTTGTTA ATCCTTTCGT 2580 

TGCTTTATCT TTAACTTCTT TGAAGTAAGC TTTTTTAAAT TCTTCAATAG TATTAAATGT 2640 

ATTGTTAGAT ATTTTCTTGA TAATATATTC ATCACTTAGA ACAGACTCAC CATCTGTTTT 2700 

AGATTGTTGT TTATATTTAT TTGAAGCATA ACCTAAGAAC CCATTTTCGT ATCCGTAGTA 2760 

ACCCCATAAT CTAAAAGCAT TATGTTTGAA TGAAACAGCT CCAGGAGCAC CTTTACTAGT 2820 

ATTACCTCCG TAGATACCGG TCATCATTCT AACACCTACA TAAGGTGATT GATCGTTATA 2880 

GCTAATTGCT TCGGGTTTAT AGATACCATT ACCTGGATTG CGATTAGTCA TTAATTGTTG 2940 

ATCAACTAAA TCATTAACAG ATTGAATATT TAATTCATTT TTCTCTTCTT GACTTAGATT 3000

TCGAATTTTA TCCCATTGAT TTAATTTATT GTTATCACGG TATTCTCTAT CTATTTTTTT 3060 

GAACCATGCA CTATTTAAAT CTTTATTTTG TTGAGAAATC ACAGATTCAG CCTCAATTTC 3120 

ATCAAGAAGA GTTAAAGTGT CATTATAACC CTTCATATAT CTATTAATAT CTTCTCGTGT 3180 

TTTTAGAGTT TTTGGATCTG TAATATACCA CTGATTCCCA TCATTTTTGC GTTTAAATAC 3240 

CATATTAATA CCTAAAGAAC CAAACTCATC AAATCCACTA CCAGTAACAG GAGTTTGTAG 3300 

CATACCCTGA GCATATGCTT CAGCATCAGT ACCTTCACGG TGTCCAAAGC CACCTAAGTA 3360 

AATCGCACGG TCGTTGACGT GTGTTGTTTC ATGTGTGTAA ACTGAAATAC CGTATTCACC 3420 

AACCATTTCT AAATGAACAT ATTTTACATC AGTTCTAATA TCATCAGAGT TAGGATATAT 3480 

AGCAGCATAA GCTCCTGTTC CATTATAATT ATAATACTTA TCCATAGGAC CAAAGAATTC 3540 

TCTAAGAGGA GTATATACTT TGTCGGTATT ATAGCGGCCA TATTTTTCAA CCCATCCACC 3600 
AGGAGCGTTA TAACCTTCCC AAATAGGAAT AACAGCATCT CTTAGTAGTC GTTGTTTAAC 3660

GTTATCAGAC GCTAGACGAT ACCAGAAATC ATAATAGTTT CTATAACCAT CTGCAGCTTT 3720

GTTAACGATA TCTTTAATAT CTTCTAATGA TTTTTTACCT AATCGCTCTG CACTACCAAA 3780

GGCAATTGCA TTATAATTTG AAATTAAATA AAGATGTGCT TTATCAATAT TCAGTAGTGG 3840

GAGTATAGTA TTTCTAAGGT GACTTCGTTT TAAATTATCG AATGCACGAT GTTTAGAATT 3900

TTTAATTTCT TCGACCTCAG AAGCGCGTTC TGCGATGTAG ACATGGTCTT CTGTAGCATC 3960

AATAAACCAA TCGTTCATAT TGTCTATATT TGTGAACAAT TGTCTATTAT AATTTAAAAA 4020

TGCATCTAAA TTACCTGATT TAGTATATTT AGCCAATACT TGACCGAATG CGTCGAATGT 4080

ACGTGAACCT TTAATGTTGT TCTCTTTAGA ACCGATTTCA ATTAATCTGT CTAATACGCT 4140

AACTTTTTCA CCATAGAAAT CTGGTTTGAA TAGCATTAAT TCTTTAATAT TAACATCACC 4200

AAATTTAACT CCATAGTAAC GATTTAGGTA AGTTAAACCT AGTAATAAAG CTGCTTTGTT 4260

TTTCTCGACT TTATCACGAA TCATTTGACG AGCAGCTGGA GAATCATTTA GTTGATGTTC 4320

TTCGTTTTGA ACTAATTTTG TGATTAGGTT TGTTAAGTTT TCTTTAACAT CTGTGAAGCT 4380

TTCTTCTAAA TATAAATCTT TGATTGCATT AACTCTATAG TCACCTAATC GATTTAGATG 4440

CTGATACATC GTTTGAGACT GAAGCTCTAC TGATTCTAAA ATAGATTTTA TATCATTAAC 4500

AAGAGTAGTG TTATCTTTTT GAACGATATT AGGTGTATAT TTAATTCCTA AGTCAGTTAT 4560

AGTATATTCT TTTACATTAC TTAAACCTTC ACTGCTAGAA GACAAGTTAA AGTAATCTTT 4620

TGTACGGTCC GCATAGTGAA CAATAATTTT ATTAGCTTCA TCTAGGTTTG TGATAAACTC 4680

ATTGTTGTTC ATCGCGGTAA CAGAAAGAAC TTCTTTAGTA TTTAGATGGT GTTCTTTATT 4740

TAATTTATTA CCTTGATATA CAATATAATC TTTATTGTAG AATGGTATTA ATTTTTCAAG 4800

ATTTTTATAG GCTTGGTTAT ATTCAGCGTT ATAATCTTGA ATACTAGAAT AGGCTTTTTC 4860

TTCATTAAGT TTTGCAAGAG GAGATAGATC ACTTTCTAAT TTATCAGCAG TAATATTGAA 4920

AGTAGTAACT TTAGCATCAG CTTGTTCTTT AGTTAATTTA GTAAATGTTT TAGATTTCCT 4980

AAATGATCTA TTACCTGACG AATATCCCTC TACCGCATAT AAATCTTTTA TATGAGCACT 5040

AGCATAATCA GAATCATCAA CGTCGTTAGA GCCGAATAAC TCCTCTCCAC GGATAATCTT 5100

AGCATAGCTG ACAGAATTAC TTACCGTACC TACAGGCCAA GTCTTACTTG CTATTGCTCC 5160

AACTTCTACT GGATTTGAAA CATCTATTTT ACCTTTTACA ACCGACTCAG TTAGGAGAGC 5220

TTTTGTACCA ATAAGATGGT CTAGAGTTAA TCCATAATCT ACTTTAGGAA CTAACAAGCT 5280

GGCGCGTGTT TTGTTTCCTG TAATAGTAGC ATCAACATAT GCTTTTCTAA CAATTCCTCT 5340

ATAGTTTGTA CCTGCAATTC CCCCTGTATG AGAGCCATTT CCACTTGTAG AGTGTAGTTT 5400

GCCAAAGAAA GCAACATTTT CAATACGAGT TCCATCATTC ATATTATTTA CAAATCCAGC 5460 

AACATTATTA CGACCTGAAA GTGTGCCTGT AATTTTGACA TTTGTAATAA CTGAAGAACC 5520 

TTTCATAGTA TTGGCTAATG ATGCAATATT ATCTTGACCA GAACGTTCTA TCTCTACATT 5580 

TTCAAAATTC ACATTATTTA TCGTTGCGTT TGTTATCACA TTAAATAATG GATGTTCCAA 5640 

TTCAGTAATA GCAAATTGTT TTCCTTCAGA ACTTAAAAGT TTTCCTGTGA ATTCTTTAGT 5700 

GATATATGAT TTTCCATTAG GAACAACATT TCTAGCGCTC ATTGATTGTC CCAGACGATA 5760 

TTCTTTTGAA GGATCGTTTT GAATAGCTTC CACTAATTCT TTGAAATTAT AATATACATT 5820 

ATCTTCGTGG ACTTTAGGTT TTTCAATATA GTGAACGTAT TCTTCTTCAA ATTTATTATC 5880 

AGCAGTTCTA GAGACTAAAT TGTCTGCGAT TGCTGTAACT TTATATACAG GTGTTCCGTT 5940 

AACCGTAGTT TCTTCTATAT TTTTAACAGC TAGTAATGTA GTTTTCTGAT TATTTGAAGT 6000 

TATTTTTAAA TAATAATTGC TCTTATCATC AGGAATAGTT GTTATCAGTG ATTCATTAGT 6060 

TTCTTTTCCA TTTTCGTATT TGATTAAATC TGTACGTTTA ATATTTTTAA GCTCAACTTT 6120 

TTTAAGATCT AATTGAATAT TTTGATTTTC TAGAGTTTCA GTTTCTTCAC CGTTACCTCT 6180 

GTCGTAAATC ATAGTTGTAG ATAGGGTGTA TTCTTTGTAG TACTCTAGGT TCTTAAATGC 6240 

AGCGCTTATA GTTTCTGTTG TTACCTTGTC ATCTGTAAGG ACTACAGTAT TAATAACTTC 6300 

TTCTCCTTTT TTCAATTCAG CTGTGATTGA TTTGATTTTT GTTTTGTTTT GATTTTCTAG 6360 

AGTATACTTA GCAACAGCTT CACGTTCCAA TATTTTCTTA TCGGTACTAG TCAATGTTAA 6420 

TATTGGCTTT TCAGATAATT CAACCAATTT TTCAATAGTT GCAGTTAATT TTTCAACAGC 6480 

TTCGTTAACT TCACTTTGTT TAGCATCTGT ATTAGCTGCA ACTTTTTCAG CCTTTGTAAC 6540 

TTCAGTTTGG AGGTTTTGCC AACTTCTATC ACTGTAATGT TCTTTTACCT TTGTTTTTGC 6600 

ATCTGCAATC GTATTGTTTA ATTCAGTTTT ATCAACGTTT AGAGCGTCAA TAGCCGTTTT 6660 

AAGTTTATTT GTCTCGCTAT TTACCTCAGG CTGTTTTACA GGCTCTGAAG CATAGACACC 6720 

TTTTGCAGTT TCTAAAACAG GTCCAAGAGC ATTGTAACTT GCTGTAGAAT AATCAGTAGG 6780 

AGAAACTGAA CTAGCTTTAT CAATTTGATT ATTTAACTCA CTTTTATCAA CTGGTTCTTT 6840 

AGTACCAATA CCCTTTATTT TATCTTCTGG TTTCGGTGTT TCCTCTACAG CCTTCTCTTC 6900 

TTCAGGAACT TCTGGTTGCT TTTCTGGCTC AACTGGTGCC GTTGGTGCCT GTTCGTCTTC 6960 

TCTTGGCGCG ACTGGTTCAC CTGCTTGTTC AACTTTTGGT TCCTCTGTTG GTTCTGTTTG 7020 

TTTTTCTACA GCAGGCGTTT CAACTTTTGG TTGTTCAATA GATTGATTAA CAGTCTCCTC 7080 

TTTTGGTTCT ACAGTTTCTT CAGCCTTGGT ATCTGGAGTT GACTCTTCTT GTTTCGGTGT 7140 
TTCCTCTACA GCCTTCTCTT CTTCAGGAGC TTCTGGTTGC TTTTCTGGCT CGACTGGTGC 7200

CTTTTCGTCT TCTCTTGGCG CGACTGGTTC ACCTGCTTGT TCAACTTTTG ATTCCTCAGC 7260 

TGGTTTGTCT GATGGTTGAC TTTCTGGCTT AACTGCTACT TTTTCCTCTG GTTTTGACTG 7320 

AACTTCTCCA CCTACTTCTT CAACTGGAGC TGGTTCTGCT GAATCTTCTT TCCCCTGTTG 7380 

TACTTTAGGA AGGGTGTCGT CAGTAGGTTT TACCTCCGAT TTTGGTTCTT CCTTTGGACT 7440 

TTCTTCTGTT TTAGGTGCTT CTTCTTTTGG AGCTTCCTCT GTCTGTACTA CTTGGTTTTG 7500 

TGTCCTAGCT TGCTCCTGAT TTGTTATTGA TTGAGGAGTC TCAACTTGGA CCACAGTCAC 7560 

CTCTCCAGGT TTTGCTGAGG TTTCTTCTAA AACAGTGTCC AAGGCAAGCG TTTTGAGGAT 7620 

GTCACCTGAT AGATAACCAA CATAGCGATA GCCCTCCATT TCAACAACAG CCTCTCGACT 7680 

AGCCAGCGCT AGGGTCGCAA CTGGGTCTAC AGCCCCTGCA CTAGGAAGAA CTACCAATCC 7740 

CATAGCTCCA ACTAGAAAGA CGCTAGCAAT TTTCTTTCTC TTGTAGATTA AAAGCAAGCT 7800 

CCCAACAGTC AGCAAACCAA AAGCTGTCAA AACAGATGCT TCTGTCCCTG TTTGAGGCAA 7860 

CTGATCTTTT TGATACACCA AACCATATAC AACTTCATTC CTGTCAGGGT TTGCTGTCTG 7920 

AATTAAATCT TTAGCTTCTT GTGAAATAAT CTCTTTATTT ACATAGTGAT AGGTGGCTGC 7980 

GTCCACTACA GAAGGAGCCA TCAAAAGGCT TCCAAGAAAT ACAGAGCCTA CAACTCCCTT 8040

AATCTTACGA ATTGAAAAAC GGTCTTTTTT AAACACTTTT ATCTCCTTTA TTCATTCTCA 8100 

AAACTTCCTA ATAGCATCTT GCGGATAGTG CGCACGCGCA CCTCCGATTA ATTTTGGACG 8160 

ACTAGCCAGT GCCGTTACAT GGGCATGACC AATCTCTCTC AAAATAGGGC GAATCGGAAC 8220 

CTGAACATGC TTGACATGCA TGCCAATTGC AGTGTCTCCG ATATCCAATC CAGCATGAGC 8280 

CTTGATAAAT TCAACCTCAA CTGGATCCTG CATAAACTTA AAGGCTGGCA ACTGGGGGGA 8340 

ACCTCCTGCA TGAAGAGTAG GATGGACACT GACAATTTCC AGACCAAACT GCTCTGCCAC 8400 

GTGACGTTCA ACAACGAGAG CCCGATTGAC ATGCTCACAA CCTTGAACTG CTAAATGGAT 8460 

ACCTCTACTA CCTAGAATAT CCAAGATAGT CTCCACTATC AGCTCACCAA TCTCTTGACT 8520 

GGATTCTTTC CCAATATGAC CACCTAGCAC CTCACTAGAA GATAGACCTA AAACAAAAAG 8580 

GGCCCCCTGC TTCAAATTGG TCTTTTCTAA AACATCTTCC ACTACCTGAC GTGTTTCTCT 8640 

TTGAATCTGT GTCTCGTTCA TCTCTGTTAC CTCTGTTGTC ACTCTTCTAT CATACCGTTT 8700 

TTTCTTGTTT TTAGCAAGAT AGACAACCTA GAAAGTTTGC CCAATTACGG ATAAAACTCC 8760 

CAGAATTGAC TGGGAGTTAG CTAGTTTCTA TTCTATTTAT ATATATTTCA ACTTTCGTGC 8820 

CTTTTTGGGG TCTAGAATCA ATCTTCATAT GGTAATTGGC TCGAAAATGA AGTTTGAGGG 8880 
GTTGATCGAC ATTTTGAAGA CCAACTCCCC CACGTTTGAG TTGACTTTGA CTACTATCAC 8940

CAGCATCTTG GAAGCCAACG CCATCATCCT CAATACGGAT GACCAATCCC GAATCCTGTT 9000 

TCTGGACAGA AAGTTTAATA TGGCCCTGAC CTTCCTTTTC CTTAATGCCA TGGTAAAGAG 9060 

CATTTTCTAC AAGGGGTTGT AGGACCAGCT TGGGTAAGAC TAAATTATCA AAGGCAACAT 9120 

TTTCATTAAT TTCGTATTCC AGCTTATCTC CATAGCGTTG TTTCTGGATA AAGAGATACT 9180 

GGCGGACATG ATTGATTTCG TCAGAGAGAC AAATCAAGTC CTTGCCTTGA TTGAGCGCCA 9240 

AGCGGAAATA GGTTGCCAAG GACTTGGTCA CCTGCACCAC TCGCTGACTA TCATGAAATT 9300 

CAGCCATCCA GATGATGGTG TCCAAAGTGT TATAGAGGAA ATGTGGATTA ATCTGGCTCG 9360 

AAAGGGCTTG AAGTTGGTAC TGACGGGTCG TTTCTTCCTG GCTACGAATA GCTACCATCA 9420 

ACTGATCAAT CTGATCCAAC ATAGCATTAA ATTGGCGAGT TACTTCTCTC AGTTCATAGG 9480 

CACCAACTTC CTTGGCACGA AGATTTTGAG CACCAGAAGC AATTTCCAAC ATGGTTTCTC 9540 

TCAAATCCTT CAAAGGAGCA ATCCAGCGTT TAAGACTGAA CCACACTAAG CAGAGACAGA 9600 

CAAGAAGAGA TGTGACACTG GCCCCAAGCA AGGTCCACAA GAGCTGACTC CGAACCTGGT 9660 

CTAACTTTTC CAATGATGAG ACGCCAAGCA CCGTCCAATC AGTTCCTGCA ATCTTCTCTT 9720 

GACTGACGTA GGATTTGTGA CCAGGAGTAT AACCCTGACC TGTATCGATG TAGGGTTTCA 9780 

TAGCCTCCAT TTTGCTAGAC GAACTATAAA CTGTGTGTTG AGGATGGTAG ACAAATTCAT 9840 

GGTTTTCATT GATAATGAAG GCAAAGCCCT GCTGCCCCAA CTGGAGTTGA TTGAGATAGG 9900 

CTTCCAGAGT TTCATAAGAA ATATCCAAAC GAAGCACACC AAGATTGGCT CCCTTTGCAT 9960 

CAACAAGTTC TTGAGTGACA GAAATGACCC ACTGACTATC TGATTTACGA GCTGGAGTCA 10020 

AAACAGGCAT AGCTCCCTGA TGAATGGCCT TTTGGTACCA ATCCTCAGCC ATCATATCAG 10080 

AGGAAGTTTT CATCTGCACA CTGTCATCTG TAGAAATGAC CTGACCAGAT TTGGTCACCA 10140 

GCACAACAGT TTTCAAGTCC TTATCTGACT TCAAGATGGT CAAAAACAAA TCTCGGATTG 10200 

CCTCGACCTT GTCTTGACTG GGATTCTCAG CATAGGCCAG AACATCCGTC TGCTGGGTCA 10260 

AACCAGTCGA GGTGGTTTCT AGTTTTTTGA TATAAGACTG AATAAAGTGG CTAGTCTGGC 10320 

TGATGGTCGT TTGGCTGTTG CCCTCAATGG TGGCCTCAAT GGCTGAAGAA CTTGATTGAT 10380 

AGTAGAAAGT TCCAACCAGA GCTAGGAGAA TGAGAAAGAC CAGAAAGATG GAAATAACCA 10440 

TTCTAACTAA AAGAGAAGAA CGCTTCATCG GTCTTCTCCC TTCTTAAACT GACGAGGTGT 10500 

CACACCTGCA ATCTGCTTAA AACGTTGGGT AAAATAGTTC ATATCTTCAA AACCAACCTT 10560 

CTCTGCGATC TCATAAATCT TCAGATCTGT AGTTAAAAGC AAGAGCTTGG CTTGTTTAAC 10620 

ACGTTCTCTC ACCAGATAAT CCTGAAAAGG CAAGCCCAAC TCTTTCTTAA TCAAGGAACT 10680 
CAGATAGGTC GGACTAAAAC CTAAGTCACT GGCTAAAGAC TTTAAACTAA ATTGGCTATC 10740

AGCCAGATGA GACTGGATTT TCTGGGCCAT GTTTCCTTCA AACCTATTAG TCAATAAATC 10800 

TTGTAACTGC TCTTCTTTCT CTTCCTTGTC TAGTTTTTGT TTGATTTTCC CCAACATTTC 10860 

CTCAATATCC TGACGAGAAA AGGGTTTGAG CAGGTAGTCG TCCACACCTA GTTTGACAGC 10920 

AGACAAGGCA TAATCAAAAT CATCGTAACC TGTTAAAAAG ACCAAATGAA CCTGAGGATA 10980 

GGTTTCTCGT ACCAGACTGG CCAACTGGAT GCCATTTAGA TGAGGCATGT TGATATCGGT 11040 

TAAAATGATA TCTGGCACCT GCTTTTGGAT CAATTCCCAA GCCTGCCTTC CATTTTCAGC 11100 

CTGACCGATG ATTTCCATAT CGTAGGCTGC TACATTGACC AGTTTAGTCA AACCTTGTCT 11160 

TACCAGATAT TCATCTTCTA CGATTAAGAT TGTGTAGGTC ATGCTCTGCT CCTTTACCAC 11220 

TTACTAGTAT CAGTATAGCA AAATTCTCCT CTAACTGCTT AGGAAAGACC TCTTATACTC 11280 

AATAAAAATC AAAAAGTAAA CTAGGAAGAT AGCCACAGGT TTCTCAAAGT ACCGCTTTGA 11340 

GGTTGTAAAT AAAACTGACG AAGTCGACTC AAAGTATAGC TTTGAGGTTG TAGATAAAAC 11400 

TGACGAAGTC GATAACCCTA CATACGGTAA GGCGACGCTG ACGTGGTTTG AAGAGATTTT 11460 

CGAAGAGTAT TAATCAACAT AATCTAGTAA ATAAGCGTAc CTTTTTCTTC CATTTGGTCT 11520 

TTGGGAATAA AGCGGATAGA GAGGCTATTG ATACAGTAAC GTAAGCCGCC CTTGTGCTGT 11580 

GGACCATCCG TAAAGACATG CCCAAGGTGA GAATCTCCTA CTCGGCTCCG CACTTCCATA 11640 

CGCGTCATAT TGTAGGACTT ATCTTCCTTG TAGGTGACAA CATCTGGACT GATGGGTTGG 11700 

GTAAAACTAG GCCAGCCACA ACCAGACTCA AATTTGTCTT TTGATGAAAA GAGAGGTTCC 11760 

CCAGTTGCTA TATCCACATA GATACCGGAT TCAAATTTAT CCCAGTAACG GTTTGAGAAA 11820 

GCTCGTTCTG TTTGATTTTC CTGGGTAACT GCATACTCCT CAGGTGACAG GGTCTTTTTC 11880 

AATTCCTCAT CACTTGGTTT TGGATATTTG CTGGCATCAA TGACAGGATA GGCCGCCTGA 11940 

TTAACATTGA TATGGCAGTA GCCATTTGGA TTTTTCTTGA GATAGTCTTG ATGGTAATCC 12000 

TCAGCCACCA CAAAATTCTT CAAGTTTTCC TTTTCAACTG CTAGAGGTTG ATCGTATTTC 12060 

TTAGCCACCT CATCAAAGAC TTGGTTAATC ACTTCCAAAT CCTTGTCATC TGTGTAATAA 12120 

ACACCAGTAC GGTACTGGGT CCCCACATCA TTTCCTTGTT TATTTTTGCT GGTTGGATTG 12180 

ATAATGCGGA AATAGTGAAG CAGGATTTCC TTGAGAGAAA TTTGCTTGGC ATCATAGGTG 12240 

ACATGGACGG TTTCTGCATG ACCTGTTTGG TTAATCAATT CGTACTTGGT TGTTTCTCCT 12300 

CTACCATTTG CATAGCCTGA AACGGCATCC GTCACCCCGG GAACAGGTGA GAAATATTCC 12360 

TCCACTCCCC AGAAACAACC TCCAGCTAGA TAAATTTCGT GCAAGTCTGC GTCTTTACTA 12420 
ATTTCTGTTT TTTTCACTGC TTTTCCTCCT TGGCTAACTG CCGCCTTTTC AATTTGCGAG 12480

GCATCTGTCT GCCCTGCATT TCGTATCAAT AGAACATAGA AACCGGTTAT GGCTAGAAAA 12540

AATACTCCTA GCAACAAGAA GATTTTTAAC TTATCATTCA TAAGACGCCT CCTAGGCTAA 12600

TTCCTTCAAA GTTTGCAAAA TTGCATCTTT TTCCATGAAT CCTGGATGTG TTTTGACCAG 12660

CTTGCCTTCT TTGTCTATAA AGGCTTGGGT TGGGTAAGAA CGGACACCAT AAGTTTCCAA 12720

AAGTTTGCCT GATGGGTCAA CTAGGACTGG GAGATTTTTA TAATCCAATC CCTTATACCA 12780

ATTCTTAAAG TCCGCTTCAG ATTGCTCTCC CTTATGTCCT GGTGACACTA CTGTCAAGAC 12840

CACATAGTCA TCACCAGCTT CTTTAGCAAT CTCATCCGTA TCTGGAAGAC TAGCCAGACA 12900

GATGGAACAC CAAGAAGCCC AGAATTTGAG ATAGACTTTC TTGCCCTTGT AATCAGATAA 12960

ACGGTAGGTC TTGCCATCTA CTCCCATCAA TTCAAAATCA GCCACCTCTT TCCCTTTAGC 13020

TGCGCTTGTT TTACTAGCTG TCTGCTCCGT CTTCATTTCA TCTTTCGTTT GGTGTTCACT 13080

AGTCACGGAC TTGCCTGAAC AAGCCGTCAA ACAAAGGAGC GAACCTGCTC CAAGAACACA 13140

TGTTTGCCAT TTTTTCATAT TGATATTCCT TTCCATTTTA TTCAAATAAT TGACTTAAAA 13200

TTGAAGCATT TCCAAACAGA ACCAAGAAGC CCATCACAAT AATGAGAAAA CCACCCACTT 13260

TTTTGAGGAT TCCGAGATAG GGATGAAGTT TTCGGAAATG TTTCAAAACA TAACTAGAGG 13320

TCAGAGCTAG AAGCAAGAAT GGTAGCGCCA AGCCCAGCGT ATACACCAAC ATGAGACCAG 13380

CTCCCTGCCA AGCTCCTGAA CCACCTGAAG CCGCCAAGGC CAAAACAGAC CCCAGAACCG 13440

GCCCCACGCA AGGCGTCCAA GCAAAACTAA AGGTCAAGCC CAATAAAAAT GCCTGAGTAT 13500

AGCCCTTACC ATTTTGCCCC TGTCCTTGCA GTTGTAGCCT CTTTTCCTTA TAAAGCCCCT 13560

TAAAGTGTAG AATCTCCATT TGGTGCAAAC CAAGAAGGAT AATAATTGCC CCAGTAAGAT 13620

ATTGGAACCA AGAAGCATAA AGCAAATCGC CTAAAAAACC AGCTCCATAG CCCAACAAAA 13680

TAAATATAAA GGAAATTCCT GCTATAAAGG CCAGAGTTCG TAATAAACTA GTAACTGAGA 13740

TTGAAAATTT GCCGCTAGAA GCCTGAGCAC CATCCTTATC ATCTAGTAAC ACTCCTGTAT 13800

AGACCGGTAA CAAAGGTAAG ATACAAGGAG AAAAGAAGGA TAGAATCCCT GCCAAAAAGA 13860

CACTTAGAAA AAAGAAAATA TGACCCATAA AGTTCCTCCT ATCATTTTAT TGATAGATTT 13920

ATTATA 13926



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20199 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

CCCAGCAGAA AAATGGCATT TGGAGATAAT GGAAATCGTA AAAAAACTAT GTTTGAGAAA 60 

ATAACCTTGT TTATCGTGAT TATCATGCTA GTAGCAAGTT TATTGGGAAT TTTTGCAACT 120 

GCAATTGGTG CCCTCAGTAA TCTATAAAAT AGATTCAAGA AAATTTAGTG ACTGGGATTT 180 

CCCAGCCCTT TTTTAAAGTG AGAAGAAATA ATGAGTATGT TTTTAGATAC AGCTAAGATT 240 

AAGGTCAAGG CTGGTAATGG TGGCGATGGT ATGGTTGCCT TTCGTCGTGA AAAATATGTC 300 

CCTAATGGAG GCCCTTGGGG TGGTGATGGT GGTCGTGGAG GCAATGTGGT CTTCGTTGTA 360 

GACGAAGGAC TACGTACCTT GATGGATTTC CGCTACAATC GTCATTTCAA GGCTGATTCT 420 

GGTGAAAAAG GGATGACCAA AGGGATGCAT GGTCGTGGTG CTGAGGACCT TAGAGTTCGA 480 

GTACCACAAG GTACGACTGT TCGTGATGCG GAGACTGGCA AGGTTTTAAC AGATTTGATT 540 

GAACATGGGC AAGAATTTAT CGTTGCCCAC GGTGGTCGTG GTGGACGTGG AAATATTCGT 600 

TTCGCGACAC CAAAAAATCC TGCACOGGAA ATCTCTGAAA ATGGAGAACC AGGTCAGGAA 660 

CGTGAGTTAC AATTGGAACT AAAAATCTTG GCAGATGTCG GTTTAGTAGG ATTCCCATCT 720 

GTAGGGAAGT CAACACTTTT AAGTGTTATT ACCTCAGCTA AGCCTAAAAT TGGTGCCTAC 780 

CACTTTACCA CTATTGTACC AAATTTAGGT ATGGTTCGCA CCCAATCAGG TGAATCCTTT 840 

GCAGTAGCCG ACTTGCCAGG TTTGATTGAA GGGGCTAGTC AAGGTGTTGG TTTGGGAACT 900 

CAGTTCCTCC GTCACATCGA GCGTACACGT GTTATCCTTC ACATCATTGA TATGTCAGCT 960 

AGCGAGGGCC GTGATCCATA TGAGGACTAC CTAGCTATCA ATAAAGAGCT GGAGTCTTAC 1020 

AATCTTCGCC TCATGGAGCG TCCACAGATT ATTGTAGCTA ATAAGATGGA CATGCCTGAG 1080 

AGTCAGGAAA ATCTTGAAGA CTTTAAGAAA AAATTGGCTG AAAATTATGA TGAATTTGAA 1140 

GAGTTACCAG CTATCTTCCC AATTTCTGGA TTGACCAAGC AAGGTCTGGC AACACTTTTA 1200 

GATGCTACAG CTGAATTGTT AGACAAGACA CCAGAATTTT TGCTCTACGA CGAGTCCGAT 1260 

ATGGAAGAAG AAGCTTACTA TGGATTTGAC GAAGAAGAAA AAGCCTTTGA AATTAGTCGT 1320 

GATGACGATG CGACATGGGT ACTTTCTGGT GAAAAACTCA TGAAACTCTT TAATATGACC 1380 

AACTTTGATC GTGATGAATC TGTCATGAAA TTTGCCCGTC AGCTTCGTGG TATGGGGGTT 1440 

GATGAAGCCC TTCGTGCGCG TGGAGCTAAA GATGGGGATT TGGTCCGCAT TGGTAAATTT 1500 

GAGTTTGAAT TTGTAGACTA GGAGACTGGT ATGGGAGATA AACCGATATC TTTCCGAGAT 1560 

GCGGATGGTA ATTTTGTTTC CGCCGCAGAC GTTTGGAATG AAAAGAAATT GGAAGAACTA 1620 
TTTAATCGTC TCAATCCAAA TCGTGCCTTG AGATTGGCAC GAACTAAAAA GGAAAATCCA 1680

TCTCAGTAAA GAAGCTAAAA AATCCCGTGC CTCATCAGAC ACGGGATTTT GTGGTACGAC 1740

AGGCATGTAT AGCAAACTGA ATCTGGAATA GCACAGCATA TCTTCTAAAA TATAGTAAAA 1800

TGAAATGAGA ACAGGACAAA TCGATCAGGA CAGTAAAATC GATTTCTAAC AATGTTTTAT 1860

AAGCAGAGAT GTACTATTCT AGTTTCAATC AACTATATTG TTATAAATTG ATTTGAATTT 1920

CAAAATTAAA TTGTTTGATT CTTATTTCAA TTTGTTATAG TATATCTGAT GTCAAAGTTC 1980

TCGGCGAGTC AAATAGCGAT TCCCAAGCCT GACTATCGTG AGGTAGCGGA TTAAAATGGT 2040

CTGGGGATAG ACCGTTTTAA GTCTGACGCT GGAAATAAGA ATTGTCAGAA GAAGGGATAG 2100

CGAAATCGTG GCTCTACGAA CAGGAACGTG ATAATAAGGC GTATATAGCG GATAAGAGGG 2160

CATCAAACTC TAAAGTCCAA AAAGGTAGTC GTAACCTATA TGCGTAAATC ACGAGAGTAA 2220

TTGAATTCGT ACTAAGATTT TCTATTTTCA CTGTAACCTT TTAACGCCCT TATATGTTGT 2280

ATACACGAGG AAAGATGTAC GACTTATCCC GTGAGGTCTA TCACTATAAA GAGAAAACGA 2340

CAGATAGAAG TGATCCTGAG TCACGGTTAT CTGTCTGATA GGACGGTATG TATAAAACGC 2400

TTCTGTGAAC TGAGAGAAGG GGGAGAAGTT CTTGCTAAAA TTTAGTTGAA CAGCCGTATT 2460

CCGATACTTA GATAAGAGAT CTAGTCTTAG CTCCTAGTCA GTTTTAGGGG ATAAAAAAGG 2520

GGCAATAGCG ATTCGAGAAA GATTATACTC TTCGAAAATC TCTTCAAATC ACGTCAATAT 2580

CGCCTTGTCG TATGTGTAGG ATACTGACTA CGTCAGTTCC ATCTACAACC TCAAAACAGT 2640

GTTTTGAGCA ACcTGCGGCT AGTTTCCTAG TTTGATCTTT GATTTTCATT GAGTATTAGT 2700

AATTCAGTTA CTAACTCGTC AACTCTGATT TATCCAATAA AATTGAAAAG GATGGAAAAA 2760

AGGATAAATT TATGATATAC TTTATTTTGA AGACCTTATT AGAAATCTTG AAAGAGTATT 2820

GAAAACTTAG AATGAGAAAA ATTGTTATCA ATGGTGGATT ACCACTGCAA GGTGAAATCA 2880

CTATTAGTGG TGCTAAAAAT AGTGTCGTTG CCTTAATTCC AGCTATTATC TTGGCTGATG 2940

ATGTGGTGAC TTTGGATTGC GTTCCAGATA TTTCGGATGT AGCCAGTCTT GTCGAAATCA 3000

TGGAATTGAT GGGAGCTACT GTTAAGCGTT ATGACGATGT ATTGGAGATT GACCCAAGAG 3060

GTGTTCAAAA TATTCCAATG CCTTATGGTA AAATTAACAG TCTTCGTGCA TCTTACTATT 3120

TTTATGGGAG CCTCTTAGGC CGTTTTGGTG AAGCGACAGT TGGTCTACCG GGAGGATGTG 3180

ATCTTGGTCC TCGTCCGATT GACTTACACC TTAAGGCGTT TGAAGCTATG GGTGCCACTG 3240

CTAGCTACGA GGGAGATAAC ATGAAGTTAT CTGCTAAAGA TACAGGACTT CATGGTGCAA 3300

GTATTTACAT GGATACGGTT AGTGTGGGAG CAACGATTAA TACGATGATT GCTGCGGTTA 3360

AAGCAAATGG TCGTACTATT ATTGAAAATG CAGCCCGTGA ACCTGAGATT ATTGATGTAG 3420

CTACTCTCTT GAATAATATG GGTGCCCATA TCCGTGGGGC AGGAACTAAT ATCATCATTA 3480

TTGATGGTGT TGAAAGATTA CATGGGACAC GTCATCAGGT GATTCCAGAC CGCATTGAAG 3540 

CTGGAACATA TATATCTTTA GCTGCTGCAG TTGGTAAAGG AATTCGTATA AATAATGTTC 3600 

TTTACGAACA CCTGGAAGGG TTTATTGCTA AGTTGGAAGA AATGGGAGTG AGAATGACTG 3660 

TATCTGAAGA CAGCATTTTT GTCGAGGAAC AGTCTAATTT GAAAGCAATC AATATTAAGA 3720 

CAGCTCCTTA CCCAGGCTTT GCAACTGATT TGCAACAACC GCTTACCCCT CTTTTACTAA 3780 

GAGCGAATGG TCGTGGTACA ATTGTCGATA CGATTTACGA AAAACGTGTA AATCATGTTT 3840 

TTGAACTAGC AAAGATGGAT GCGGATATTT CGACAACAAA TGGTCATATT TTGTACACGG 3900 

GTGGACGTGA TTTACGTGGG GCCAGTGTTA AAGCGACCGA CTTAAGAGCT GGGGCTGCAC 3960 

TAGTCATTGC TGGGCTTATG GCTGAAGGTA AAACTGAAAT TACCAATATC GAGTTTATCT 4020 

TACGTGGTTA TTCTGATATT ATCGAAAAAT TACGTAATTT AGGAGCGGAT ATTAGACTTG 4080 

TTGAGGATTA AACCGTAGAG GTGTTTATGA ATATTTGGAC CAAATTAGCA ATGTTTTCTT 4140 

TTTTTGAAAC GGATCGCTTG TATTTGCGTC CTTTCTTTTT TAGTGATAGT CAGGACTTCC 4200 

GCGAGATAGC TTCAAATCCA GAAAATCTTC AATTTATTTT CCCAACGCAG GCAAGTCTGG 4260 

AAGAAAGTCA ATATGCACTG GCCAATTACT TTATGAAGTC CCCTTTGGGA GTGTGGGCAA 4320 

TTTGTGACCA GAAAAATCAA CAAATGATTG GTTCTATTAA ATTTGAGAAG TTAGATGAAA 4380 

TCAAAAAAGA AGCTGAGCTT GGCTATTTTT TGAGAAAAGA TGCTTGGTCG CAAGGATTTA 4440 

TGACAGAGGT TGTTAGAAAA ATTTGTCAGC TTTCTTTTGA GGAATTTGGC TTAAAACAAT 4500 

TATTTATCAT TACCCACCTT GAAAATAAAG CTAGCCAAAG AGTTGCTCTT AAGTCTGGAT 4560 

TTAGTTTGTT CCGTCAGTTT AAGGGAAGTG ATCGTTACAC AAGAAAAATG CGGGATTATC 4620 

TTGAATTTCG GTATGTAAAA GGAGAGTTCA ATGAGTAAGC ATCAGGAAAT TCTAAGCTAT 4680 

TTGGAGGAAT TACCAGTAGG TAAAAGGGTC AGTGTTCGTA GCATTTCGAA TCATCTAGGA 4740 

GTTAGTGATG GAACAGCCTA TCGGGCTATT AAAGAAGCTG AAAACCGTGG AATTGTGGAG 4800 

ACCCGTCCTA GAAGTGGAAC AATTCGTGTT AAATCCCAGA AAGTTGCTAT AGAGAGATTA 4860 

ACGTTTGCTG AAATTGCAGA AGTGACTTCT TCTGAGGTTC TGGCTGGGCA AGAAGGTTTA 4920 

GAGAGAGAAT TTAGTAAGTT TTCAATTGGT GCCATGACTG AACAAAATAT CTTGTCTTAC 4980 

CTTCATGATG GGGGGCTCTT GATTGTCGGA GACCGAACCC GTATTCAGTT GCTAGCCTTG 5040 

GAAAATGAAA ATGCAGTTCT GGTTACAGGG GGATTTCAGG TTCATGATGA TGTGCTTAAA 5100 

CTGGCCAATC AAAAAGGGAT TCCTGTTCTA AGAAGTAAGC ATGATACCTT TACCGTCGCG 5160 




ACCATGATCA ATAAAGCCTT GTCAAATGTC CAAATCAAGA CTGATATTCT GACAGTTGAG 5220 

AAACTTTATC GCCCTAGTCA TGAGTATGGT TTTCTGAGAG AGACAGATAC AGTTAAAGAT 5280 

TATTTGGACT TGGTTCGTAA GAATCGTAGC AGCCGTTTCC CTGTTATCAA TCAACATCAG 5340 

GTCGTTGTTG GTGTTGTAAC CATGAGAGAC GCTGGTGATA AATCACCAAG CACGACAATT 5400 

GATAAGGTTA TGTCTCGTAG TCTATTTTTG GTTGGATTAT CGACAAATAT TGCCAATGTG 5460 

AGTCAACGGA TGATCGCAGA AGACTTTGAA ATGGTACCAG TTGTTCGAAG CAATCAAACT 5520 

TTGCTTGGCG TTGTGACGCG ACGAGATGTC ATGGAGAAGA TGAGCCGTTC CCAAGTTTCG 5580 

GCTCTACCAA CTTTTTCTGA GCAGATTGGA CAAAAGCTCT CTTATCACCA TGATGAAGTA 5640 

GTCATTACAG TGGAACCCTT TATGCTAGAA AAAAATGGAG TTTTGGCTAA TGGTGTATTG 5700 

GCAGAAATTC TGACCCACAT GACCCGATTT AGTTGTTAAT AGTGGTCGCA ATCTCATTAT 5760 

CGAGCAGATG CTGATCTACT TTTTGCAGGC TGTTCAGATA GATGATATAT TGCGCATTCA 5820 

GGCACGGATT ATTCATCATA CGAGACGGTC AGCTATAATT GATTACGATA TTTATCATGG 5880 

TCACCAGATT GTTTCAAAAG CAAATGTGAC TGTTAAAATT AATTAGAAAC TAGGAGAAAA 5940 

GATGATAACA TTAAAATCAG CTCGTGAAAT CGAAGCTATG GACAAGGCTG GTGATTTTCT 6000 

AGCAAGTATT CATATAGGCT TACGTGATTT GATTAAGCCA GGCGTAGATA TGTGGGAAGT 6060 

TGAAGAATAT GTCCGCCGTC GTTGTAAAGA AGAAAATTTC CTTCCACTTC AGATTGGGGT 6120 

TGACGGTGCC ATGATGGACT ATCCTTATGC TACCTGTTGC TCTCTTAACG ATGAAGTGGC 6180 

TCACGCTTTC CCTCGTCATT ATATCTTGAA AGATGGTGAT TTGCTCAAAG TTGATATGGT 6240 

TTTGGGAGGT CCCATTGCTA AATCTGACCT AAATGTCTCA AAATTAAACT TCAACAATGT 6300 

TGAACAAATG AAAAAATACA CTCAGAGCTA TTCTGGTGGT TTAGCAGACT CATGTTGGGC 6360 

TTATGCTGTT GGTACACCGT CCGAAGAAGT CAAAAACTTG ATGGATGTAA CCAAAGAAGC 6420 

TATGTACAAG GGTATTGAGC AAGCTGTTGT TGGAAATCGT ATCGGTGATA TCGGTGCGGC 6480 

TATTCAAGAA TACGCTGAAA GTCGTGGTTA CGGTGTAGTG CGTGATTTGG TTGGTCATGG 6540 

TGTTGGCCCA ACTATGCACG AAGAACCAAT GGTTCCTAAC TATGGTATTG CAGGTCGTGG 6600 

ACTCCGTCTT CGTGAAGGAA TGGTCTTAAC CATTGAACCA ATGATCAATA CAGGCGATTG 6660 

GGAAATTGAT ACAGATATGA AAACTGGTTG GGCGCATAAG ACCATTGACG GTGGATTGTC 6720 

ATGTCAGTAT GAACACCAAT TTGTCATTAC GAAAGATGGA CCTCTTATCT TGACTAGCCA 6780 

AGGTGAAGAA GGAACTTATT AATAAAAAGT GAAAAGACTA CTGGAAGTTT ATTTTGATAA 6840 

AAAATCCAGT AGATCTTTTC ATAATAAAAC GCATTGTATC AAGTGTTAGG GGCTGATATC 6900 

ATGCGTTTTT CTGCTTTTAA GATTTTTTCC AACTCTGTTT GTAAGCGCAT CATAACAAAG 6960 
GGTCTAGGAT TCAGGGCTCT CCTCCTATAT ACTATTAGTA AAGTAAAACT AAGGGAGGAT 7020

ATTTTAGTGT CGCAGTCTAT TGTTCCTGTA GAGATTCCAC AATATTGTCG TTTTGATTCT 7080 

AAAAAGAGAA ATGGAATTCT GTTTAATGTT CGTATTGCCA ATCTTAAATT TACTTTTTTA 7140 

TATTATACTT CCTGCGAAAC AAAATATGGT ATAGTAGTTC TATGAATGAT GAAGCAAGTA 7200 

AACAACTAAC TGATGCACGA TTTAAGCGTC TTGTTGGTGT TCAGCGTACC ACTTTTGAAG 7260 

AGATGTTAGC TGTATTAAAA ACAGCTTATC AACTTAAACA CGCAAAAGGT GGACGAAAAC 7320 

CTAAATTAAG CCTAGAAGAC CTTCTTATGC CCACTCTTCA ATAGTGCGAG AATATCGAAG 7380 

TTATGAAGAA ATTGCGGCTG ATTTTGGTAT TCACGAAAGC AACTTTATCC GTCGGAGCCA 7440 

ATGGGTTGAA ATAACTCTTG TTCAAAGTGG TTTTACGGTT TCAAGAACTC CTCTCAGTTC 7500 

TGAGGACACG GTAATGATTG ATGCGACGGA AGTAAAAATC AATCGCCCTA AAAAAACAAT 7560 

TAGCGAATGA TTCTGGTAAA AAGAAATTTC ACGCTATGAA GGCTGAAGCG ATTGTCACAA 7620 

GTCAAGGGAG AATTGTTTCT TTGGATATCG CTGTGAACTA TAGTCATGAT ATGAAGTTGT 7680 

TCAAAATGAG TCGTAGAAAT ATCGAACAAG CTGGTAAAAT CTTGGCTGAC AGTGGTTATC 7740 

AAGGGCTCAT GAAGATATAT CCTCAAGCAC AAACTCCACG TAAATCCAGC AAACTCAAGC 7800 

CGCTAACAGC TGAAGATAAA GCCTATAACC ATGCGCTATC TAAGGAAAGA AGCAAGGTTG 7860 

AGAACATCTT TGCCAAAGTA AAAACGTTTA AAATATTTTC AACAACCTAT CGAAATGATC 7920 

GTAAACGCTT CGGATTACGA ATGAATTTGA GTGCTGGTAT TATCAATCAT GAACTAGGAT 7980 

TCTAGTTTTG CAGGAAGTCT ATTGAGGTAT TGAGCTAGTT TATGAAAAAA TTGGGTGAAA 8040 

AGTCGAGTGT TTTAGAAACC CACAGTGTAG TATTCTAGTT TCAATCCACT ATATTTTGCT 8100 

ACTCCCCGTA AAGTTTCTAT TTTCCCTGAT TTCTGATATA ATAGAAATAT TGACTTCAAG 8160 

AGTAAGGAAG AGAAGATGAA CGCATTATTA AATGGAATGA ATGACCGTCA GGCTGAGGCG 8220 

GTGCAAACGA CAGAAGGTCC CTTGCTAATC ATGGCAGGGG CTGGTTCTGG AAAGACTCGT 8280 

GTTTTGACCC ACCGTATCGC TTATTTGATT GATGAAAAGC TGGTCAATCC TTGGAATATC 8340 

TTGGCCATTA CCTTTACCAA CAAGGCTGCG CGTGAGATGA AAGAGCGTGC TTATAGCCTC 8400 

AATCCAGCGA CTCAGGACTG TCTGATTGCG ACCTTCCACT CCATGTGTGT GCGTATTTTG 8460 

CGTCGCGATG CGGACCATAT TGGCTACAAT CGTAATTTTA CAATTGTGGA TCCTGGTGAA 8520 

CAGCGAACGC TCATGAAACG TATTCTCAAA CAGTTGAACT TGGACCCTAA AAAATGGAAT 8580 

GAACGAACTA TTTTGGGGAC CATTTCCAAT GCTAAGAATG ATTTGATTGA TGATGTTGCT 8640 

TATGCTGCCC AAGCTGGCGA TATGTATACG GAAATTGTGG CCCAGTGTTA TACAGCCTAT 8700 
CAAAAAGAAC TTCGTCAGTC TGAATCCGTT GACTTTGATG ATTTGATTAT GCTGACCTTG 8760

CGTCTCTTTG ATCAAAATCC TGATGTTTTG ACCTACTACC AGCAAAAATT CCAATACATC 8820 

CACGTTGATG AGTACCAAGA TACCAACCAC GCTCAGTACC AATTGGTCAA ACTCTTGGCT 8880 

TCCCGTTTTA AAAATATCTG TGTGGTTGGG GATGCGGACC AGTCTATCTA CGGTTGGCGT 8940 

GGTGCTGATA TGCAGAATAT CTTGGACTTT GAAAAGGATT ACCCCAAAGC CAAGGTTGTT 9000 

TTGTTGGAGG AAAATTACCG CTCAACCAAA ACCATTCTCC AAGCGGCCAA CGAGGTTATT 9060 

AAAAATAATA AAAATCGCCG TCCTAAAAAT CTCTGGACTC AAAACGCTGA TGGGGAGCAA 9120 

ATCGTTTACT ATCGTGCCGA TGATGAGCTG GATGAGGCTG TATTTGTAGC CAGAACCATC 9180 

GATGAACTTA GTCGCAGTCA AAACTTCCTT CATAAGGATT TTGCAGTTCT CTATCGGACT 9240 

AATGCCCAGT CCCGTACAAT TGAGGAAGCC CTGCTCAAGT CTAACATTCC TTATACCATG 9300 

GTTGGCGGAA CCAAATTCTA CAGCCGTAAG GAAATTCGCG ATATTATTGC TTATCTCAAC 9360 

CTTATTGCTA ATTTGAGTGA CAATATTAGT TTTGAGCGTA TTATCAACGA GCCTAAACGT 9420 

GGAATTGGTC TAGGTACAGT TGAGAAAATC CGTGATTTTG CAAATTTGCA AAATATGTCT 9480 

ATGCTGGATG CTTCTGCTAA TATTATGTTG TCTGGTATCA AGGGTAAGGC AGCCCAATCT 9540 

ATCTGGGATT TTGCCAATAT GATGCTTGAT TTGCGGGAGC AGCTAGACCA CTTAAGCATT 9600 

ACAGAGTTGG TTGAGTCCGT CCTAGAAAAA ACAGGTTATG TCGATATTCT TAACTCCCAA 9660 

GCGACTCTAG AAAGCAAGGC ACGGGTTGAA AATATCGAAG AGTTTCTTTC TGTTACGAAG 9720 

AACTTTGATG ACACCACGGA TGTGACAGAA GAGGAAACTG GTCTGGACAA ACTGAGTCGT 9780 

TTCTTAAATG ACTTGGCTTT GATTGCCGAC ACAGATTCAG GTAGTCAGGA GACATCAGAA 9840 

GTGACCTTGA TGACCCTGCA TGCTGCCAAA GGTCTCGAAT TTCCAGTTGT CTTTTTGATT 9900 

GGGATGGAAG AAAATGTCTT TCCACTTAGT CGTGCGACTG AAGATTCAGA TGAATTAGAA 9960 

GAAGAGCGCC GTCTAGCCTA TGTAGGTATC ACGCGTGCAG AGAAAATTCT CTATCTGACC 10020 

AATGCCAACT CACGCTTGCT TTTTGGTCGT ACCAATTATA ACCGTCCGAC TCGTTTTATT 10080 

AACGAAATCA GTTCAGACTT GCTTGAGTAT CAAGGTCTGG CTCGTCCTGC AAATACAAGC 10140 

TTTAAGGCAT CATATAGCAG TGGTAGTATT TCCTTTGGTC AAGGTATGAG TTTGGCTCAG 10200 

GCTCTTCAAG ACCGTAAACG CGGTGCTGCC CCAAAATCAA TCCAGTCAAG CGGTCTTCCA 10260 

TTTGGTCAAT TTACAGCTGG CGCAAAACCA GCATCTAGCG AGGCAAATTG GTCCATTGGT 10320 

GATATTGCTC TCCACAAGAA ATGGGGAGAG GGAACCGTTC TGGAAGTTTC AGGTAGCGGT 10380 

GCTAGGCAGG AATTGAAAAT CAATTTCCCA GAAGTAGGTT TGAAAAAACT TTTAGCCAGT 10440 

GTGGCTCCAA TTGAGAAAAA AATCTAATTT TCCATCCTTC TCACGAATAA TAAAGTGAGG 10500 
AGGATTTTTA TGTACAGTAT TTCATTCCAA GAAGATTCAC TATTACCAAG AGAAAGGCTG 10560

GCCAAGGAAG GAGTTGAAGC GCTTAGTAAC CAAGAGTTGC TAGCTATTTT ACTCAGGACA 10620

GGAACACGTC AAGCTAGCGT TTTTGAAATT GCCCAAAAAG TCTTGAACAA TCTTTCAAGC 10680

CTAACGGATT TGAAAAAAAT GACCCTGCAG GAATTGCAGA GTTTGTCTGG TATTGGGCGT 10740

GTTAAGGCCA TAGAATTACA AGCTATGATT GAACTGGGGC ATCGTATTCA CAAACACGAG 10800

ACTCTTGAAA TGGAAAGTAT TCTCAGCAGT CAAAAGTTGG CCAAGAAGAT GCAGCAGGAA 10860

TTAGGGGATA AAAAACAAGA GCACCTGGTG GCACTCTATC TCAATACTCA AAATGAAATC 10920

ATCCATCAGC AGACCATTTT TATCGGGTCT GTAACTCGTA GTATCGCTGA ACCGCGAGAG 10980

ATTCTTCACT ATGCAATCAA GCATATGGCG ACTTCTCTTA TCTTGGTCCA CAATCATCCT 11040

TCAGGAGCGG TAGCGCCTAG CCAAAATGAT GATCATGTCA CTAAACTTGT TAAAGAAGCC 11100

TGCGAATTGA TGGGGATTGT TCTCTTGGAC CATTTGATTG TCTCTCATTC TAATTACTTT 11160

AGTTATCGTG AAAAGACAGA TTTAATCTAA AGTTCATTAA CGACATAGTC AAAGAGTTTT 11220

TTATCTTTGG GACGATTTTC AAAAAGAAGT TCTGGATGCC ATTGGACACC GAGAAAGGCG 11280

ACATCATCCG TACTCATGAC AGCCTCAATG ATACCATCTT TAGGATCATG AGCCACAACT 11340

TTTAAATTTG GTGCTAAGTC CTTGATGCTC TGGTGGTGGA AGGAGTTGAT ATGAGAGATT 11400

TCTCCATAGA TTTCTTGGAG AACGGTATCT GGTTCTGTTA CCAAGCGTTG AGTTGTGTAC 11460

TCAACAGAAG AATCCTGCCA ATGGTCTTCG ATATCTTGGT ACAAAGTTCC ACCCATGGCA 11520

ACGTTAAAGA GTTGGGTACC ACGGCAGACA GAGAAAATGG GCTTTTTCTG TTTAATAGCT 11580

TCCTTGATGA GGGCCAGTTC GAAGATATCT CTTTGAAGGT GATAGTCATC ACTATCAATG 11640

GTTTTGGGTT CGCCATAAAA TTTTGGATCG ACATTTTGCC CACCTGTCAA GATGAGCTTG 11700

TCAATCAAAC TGATATAGTG GCAGGCCATT TCTTGATCAC CAATCGGTAG GATGATGGGA 11760

ATCCCTCCAG CATCTTTAAC GCCTTCAACA AAGCCTTTTG CTGCGTAGCT CATCATGATG 11820

TCATCATCTG GATGAGTTTT TTCGTTTCCT GTAATCCCAA TAACTGGTTT TTTCATAAAA 11880

TGATTTTCGC TTTCTAATCC TCTTTTCGCA TGAAGTAGAG GAGGGTTTGG AGTTCACTTG 11940

TCAAATCGAC ATACTGAACG ACCACGTCTT TTGGTAAATG CAGATGGACT GGTGAAAAAC 12000

TGAGAATTCC TTTCACACCA GCATCAACCA AGAGATTAGC AACCTCTTGT GACTTGACGC 12060

TGGGAACAGT TAGGATAGCA GTCTTCACAT CAGCATCCTT GATTTTATCC TTGATCTGAG 12120

AAATCCCGTA AATGGGAATC CCGTCAGGAG TTTGGGTACC GACTTCAGGA TGGTCGTCTA 12180

GGTGAAAGGC CATGATAATC TTCATCTTGT TACGTTCGTG GAAGCGGTAG TGGAGAAGGG 12240
CATGGCCCAT


ATTTCCAATA CCAACCAGCA TGACATTGGT AATAGAGTTG 


TCATTGAGCA 


12300 


AATCGGCAAA AAATGTCATT AGTTTTTTGA CATCATAGCC 


AAAACCACGA 


CGACCAAGTT 


12360 


CACCAAAATA 


GGAAAAATCA CGACGTACGG 


TCGCTGAATC 


AATACCGATA 


GCCTCTGCAA 


12420 


TTTGCTTAGA GTTGGCACGT TCAATCTTTT CTGCATGAAA 




ATTCGATAGT 


12480 


AGAGAGAGAG 


TCTTTTTGCT GTAGCTTTTG 


GAATAGCAAA 


X. \J X X X £\ 1 V* X 


TTCACAAAAT 


12540 


CACAACCTTT 


CTATTCTTCT ATTTTATAGA 


AACATTGTGA 




AAAAATAAGA 


12600 


AAAAACTAAG 


AAAAATCTTA GTTTTGATGT 


AAAAAATCTG 


CATGAGATAG 


AAAACGGTAG 


12660 


AGGTCTCCGA 


CCAGCCCCTG ATAAACTTTT 


TTGCCCCTAA 


AAGTCAGAGA 


AGTCACATAA 


12720 


AGTGTATCTG 


GTAAGGTTAC ACATCCTGAC 


AAAGTCAACA 


TGAGAGCCTC 


ATGATCCTCA 


12780 


TACTTGAGAG 


TACGCTCTAC ATGATAGCAG 


TCCTTATAGG 


TCAGTTCAAA 


CATTTTGGCT 


12840 


CTATCTTTCC 


GATTTTGTAA AGACACCACG 


TTCTACCAAG 


CTATCCATGA 


GGAAGTAGAA 


12900 




TGAATATGGT GGTCTTCTGA 


TTTGAAAATA 


TCAACTAGAC 


GAAGGCCAAA 


12960 


CTTGTCAGTG 


ATATTGATTT TAGCCCCTGT 


AAGTTCCTTG 


TTAATGATGA 


TTTTGAGTTG 


13020 


GAAGCCTTCA 


CCGCTGTTTG GCACTTTTTC 


CAAAAGGCGA 


GTCAGTTCAT 


AGTTACCAAC 


13080 


CTTAGTTTCA


AAAAAGGTGT TATCTTTGAG 


GGTGAATTTT 


TTAACAGAAG 


GGCTAAGAGT 


13140 


GTAATCGTAA 


CGACAATTTT TTAACTGAAT GATTTTTTCA AATGCCATAT 


GGCTAACCTC 


13200 


CGATAATTTC 


TTTTAAGGTT TTTGCGAGGG 


TTTGTAGGTC 


TTCAACGGTA 


TTTTGTGGCG 


13260 


ACAAACTGAT 


GCGAAGGGAT TCCTTCAAGC 


GTTCTGAATT 


TGCGCCATAC 


ATGGCTTCAA 


13320 


GAACATGGCT 


GGATTGGACA ACGCCTGCAG 


TACAGGCTGA 


GCCAGTAGAG 


ATTGAAATTC 


13380 


CAGCTAAATC 


TAGCCGAAGG AGTAAGAGGT 


CATTTTTCTG 


ACCAGGAAAT 


CCAATATTGA 


13440 


GAACATAAGG 


GAGATGATGT TTTCCTCTAT 


TCAGGTAATA 


CTGAATGCCC 


TCCAGCTCTG 


13500 


CCAGAAAGGC 


AGTTTCTAGA TTTTGTACAT 


GTTGAAAATG 


TTCTTCTTGT 


TTTTCTAGGT 


13560 


CTTCTTTTAG 


GGCTGCAACC ATGCCTACAA 


TGGCAGGCAG 


ATTTTCAGTT 


CCTGCACGTT 


13620 


TTTTCTGTTC 


CTGGTCTCCG CCATGTAGAT 


AGGAATCAAA 


GTCCATGCTA 


GATGCGTAGA 


13680 


GAAAACCGAT 


TCCCTTAGGA CCATGGAATT TGTGGGCAGA AGCAGTGAGA AAATCAATGC 


13740 


CCAATTCTTC 


TGAATGAATT GGGATTTTAC 


CAATAGCCTG 


AACTGCATCA 


ACATGATAGG 


13800 


CAGCAGGGTG 


TTGCTTGAGT ATTTGGCCAA 


TTTCAGCGAT 


GGGCAGTAGG 


TTTCCTGTCT 


13860 


CATTATTGAC 


AAACATGGTA GAAACCAAAA 


TCGTATCGTC 


ACGTAAAGCC 


TTTTGAATTT 


13920 


GCTGGGCTGT 


GATTTCTTGA TTTTCTGGCT GGATAATGGT TGCTTCAAAC CCAAAGTGTT 


13980 


GAACCAAGTA ATCAATTGTT TCAAGGACAG 


CATGGTGCTC 


GATGGCAGTT 


GTGATGATAT 


14040 
GTTTTCCTTG


TTCTTGGTGA 


CGAAGACAGT 


AGCCAATGAT 


GGTAGTATTA 


TTGCCTTCAG 


14100 


TCCCACCAGA 


AGTGAAAAAG 


ATATGTTGAG 


GTTTTGTCCT 


TAGTAACTGG 


GCTAGTTCCT 


14160 


GACGGGCTTC 


TCGCAAGAGT 


TTGCCAGCTT 


GACGACCATG 


ACCATGAATA 


CTAGAAGGAT 


14220 


TTCCGTGGGT 


TTCTTGCATA 


ACCTTGGTCA 


TAGCTGAAAT 


AGCAACTGCT 


GACATAGGAG 


14280 


TCGTTGGAGG 


ATTGTCCAAA 


TAAATCAAAG 


AATCACCTTA 


TTTCTTTTTA 


TTGTAGGCAA 


14340 


AGAGTGGGCT 


GACTGGTTTT 


CTTTCGTGAA 


TAGGGACGAT 


AGCATCACGA 


ATTAACTCAG 


14400 


TAGCAGTGAT 


GTAGCATACA 


TTTTTAGGAG 


TTTTTTCTTT 


TGTTGCTACT 


GAATCAGTCA 


14460 


CAAGAATTTC 


TTTAATATTA 


GTATTGTCAA GAAGCTCAGC 


AGCTCCCTGG 


ACGAAGAGAC 


14520


CGTGGCTAGA 


AACAGCATAA 


ATTTCTGTAG 


CTCCTTCACG 


TTCAACGATT 


TTAGAAGCTT 


14580


CAGAGAAGGT 


ACGTCCTGTA 


TTTAAAATAT 


CATCAATCAA 


GATAGCTTTC 


TTACCTTCAA


14640 


CATCACCAAT 


AATATAACCT 


TCGTTACGAG 


TTGCATCGTC 


TTGAGGGTAG 


TCGATAATGG 


14700 


CGATAGGAGC 


ATCAAGATAT 


TCAGCCAGGC 


TAGGCGCACG 


TTTGACACCT 


GAATTTTTAG 


14760 


GGCTAACGAG 


AACAACATCT 


GAACCAAGCA 


ATCCTTTATC 


GCAGTAATGT 


TTTGCGAATA 


14820 


GGGGAACAGT 


GAAAAGATTA 


TCCACTGGAA 


TATCAAAGAA 


ACCTTGAACG 


TGAACGGCAT 


14880 


GCAAATCAAG 


AGTCAGGATA 


CGATCAACTC 


CAGCCTTAAC 


CAGCATATTG 


GCAACTAGTT 


14940 


TTGCTGTAAG 


TGGCTCACGA 


GGACAAGCAA 


TGCGGTCTTG 


ACGTGCATAG 


CCAAAATATG 


15000 


GAAGGACAAC 


GTTGATACTG 


TGGGCACTTG 


CACGGACACA 


AGCATCGACC 


ATGATTAACA 


15060 


ATTCCATTAG 


GTGGTTGTTG 


ACAGGGAAAC 


TTGTTGATTG 


GATGATGTAA 


ACATCATAAC 


15120 


CACGGACACT 


TTCTTCGATA 


TTTACTTGGA 


TTTCTCCGTC 


TGAAAATTGA 


CGTGATGATA 


15180 


GTTTTCCAAG 


TGGGACACCA 


ACAGCTTGGG 


CAATTTTTTG 


TGCAATCTCT 


TGGTTAGAGT 


15240 


TGAGTGCGAA 


AAGTTTCATG 


TTTTTTCTAT 


CTGACATTAT 


AGACCGTCCT 


CTGTAAACTT 


15300 


TATAAATCCT 


AGTTATATTT 


ACCTTACATA 


TATGAACTGG 


GATTTGTGTA 


TTTTTATCTT 


15360 


TTCTATTTTA 


CCAAAAAATG 


GAGATTATTT 


CAGCTATTTT 


TCATACTTTT 


GACAAATCGA 


15420 


ACCAATTTTG 


AAGGAGCTTT 


TTGATAGGAA 


ATCTGATTTT 


TCTCTAAAAA 


TTGTCGAAAA 


15480 


TCCTGTTTGC 


CTTGCTCATG 


ATTTTCCAGT 


TCAAGCTCCA 


ATTCGTAATC 


TGTTATATCA 


15540 


AAGTATCGGC 


TCTGATCCAG 


TGCCATGAGA 


CCAATAGCTG 


TTTTCATTTC 


ATAGCGAAGC 


15600 


GTTGTTAGAC 


AACCAAGAAC 


CTGCCAGTTC 


TTACTTTGGA 


TACCATGTTT 


CGCCAATTCA 


15660 


TCCAGTACTA 


GCCCTTGAGG 


AAGTTCTTCC 


TTACTCAGAT 


AGTTCTCAGC 


ATGTTTTAGT 


15720 


TGCAATTTTT 


GGTTGTATTC 


CATGTTTCCA 


ACACTCTGCG 


GGACTTTGAG 


TGTGAAGTCA 


15780 
W'^vnu 1\. 1 1 LAAAuui AA I Vj L\j l_ A T A tiC viA L71 I I CT


TTTCTCGCAG 


TTCAAAATCA 


15840 


GGCGTGTCGA TGTAGTAATT TGTTTGAAGA ACAGGAGTGA 


CACCTGTGAA 


CTGGTCTTTT 


15900 


AGACGATTGT ATTCATCTTT TTTCAATAGT GTTTTCAATT 


CAATTTCTAA 


ATGTTTCATT 


15960 


TTTCTTACCT TTTTTTATCG TTGAAAGCGG ATTTATGGTA 


TAATAAGCAT 


TGTATTTATT 


16020 


GTATATGAAT CTGGAGAAAA AATCAAAGAT ATTTTTGACG 


GATAATATGA 


GAACAAGGGA 


16080 


GAATATATGA CCTTAGAATG GGAAGAATTT CTAGATCCTT 


ACATTCAAGC 


TGTTGGTGAG 


16140 


TTAAAGATTA AACTTCGTGG TATTCGTAAG CAATATCGTA 


AGCAAAATAA 


GCATTCTCCA 


16200 


ATTGAGTTTG TGACCGGTCG AGTCAAGCCA ATTGAGAGCA 


TCAAAGAAAA 


AATGGCTCGT 


16260 


CGTGGCATTA CTTATGCGAC CTTGGAACAC GATTTGCAGG 


ATATTGCTGG 


CTTACGTGTG 


16320 


ATGGTTCAGT TTGTAGATGA CGTCAAGGAA GTAGTGGATA TTTTGCACAA GCGTCAGGAT


16380 


ATGCGAATCA TACAGGAGCG AGATTACATT ACTCATAGAA AAGCATCAGG CTATCGTTCC 


16440 


TATCATGTGG TAGTAGAATA TACGGTTGAT ACCATCAATG GAGCTAAGAC 


TATTTTGGCA 


16500 


GAAATTCAAA TTCGTACTTT GGCCATGAAT TTCTGGGCAA CGATAGAACA 


TTCTCTCAAC 


16560 


TACAAGTACC AAGGGGATTT CCCAGATGAG ATTAAGAAGC 


GACTGGAAAT 


TACAGCTAGA 


16620 


ATCGCCCATC AGTTGGATGA AGAAATGGGT GAAATTCGTG ATGATATCCA 


AGAAGCCCAG 


16680 


GCACTTTTTG ATCCTTTGAG TAGAAAATTA AATGACGGTG 


TAGGAAACAG 


TGACGATACA 


16740 


GATGAAGAAT ACAGGTAAAC GAATTGATCT GATAGCCAAT AGAAAACCGC 


AGAGTCAAAG 


16800 


GGTTTTGTAT GAATTGCGAG ATCGTTTGAA GAGAAATCAG 


TTTATACTCA 


ATGATACCAA 


16860 


TCCGGATATT GTCATTTCCA TTGGCGGGGA TGGTATGCTC TTGTCGGCCT TTCATAAGTA 


16920 


CGAAAATCAG CTTGACAAGG TCCGCTTTAT CGGTCTTCAT 


ACTGGACATT 


TGGGCTTCTA 


16980 


TACAGATTAT CGTGATTTTG AGTTGGACAA GCTAGTGACT 


AATTTGCAGC 


TAGATACTGG 


17040 


GGCAAGGGTT TCTTACCCTG TTCTGAATGT GAAGGTCTTT 


CTTGAAAATG 


GTGAAGTTAA 


17100 


GATTTTCAGA GCACTCAACG AAGCCAGCAT CCGCAGGTCT 


GATCGAACCA 


TGGTGGCAGA 


17160 


TATTGTAATA AATGGTGTTC CCTTTGAACG TTTTCGTGGA 


GACGGGCTAA 


CAGTTTCGAC 


17220 


ACCGACTGGT AGTACTGCCT ATAACAAGTC TCTTGGCGGT 


GCTGTTTTAC 


ACCCTACCAT 


17280 


TGAAGCTTTG CAATTAACGG AAATTGCCAG CCTTAATAAT 


CGTGTCTATC 


GAACACTGGG 


17340 


CTCTTCCATT ATTGTGCCTA AGAAGGATAA GATTGAACTT 


ATTCCAACAA 


GAAACGATTA 


17400 


TCATACTATT TCGGTTGACA ATAGCGTTTA TTCTTTCCGT 


AATATTGAGC 


GTATTGAGTA 


17460 


TCAAATCGAC CATCATAAGA TTCACTTTGT CGCGACTCCT 


AGCCATACCA 


GTTTCTGGAA 


17520 


CCGTGTTAAG GACGCCTTTA TCGGCGAGGT GGATGAATGA 


GGTTTGAATT 


TATCGCAGAT 


17580 
GAACATGTCA


AGGTTAAGAC 


CTTCTTAAAA AAGCACGAGG 


TTTCTAAGGG 


ATTGCTGGCC 


17640 


AAGATTAAGT 


TTCGAGGTGG 


AGCTATTCTG GTCAATAATC 


AACCGCAAAA TGCAACGTAT 


17700 


CTATTGGACG 


TTGGAGACTA 


CGTTACCATT GACATTCCCG 


CTGAGAAAGG 


CTTTGAAACC 


17760 


TTGGAGGCTA 


TTGAGCTTCC 


ATTAGATATT CTCTATGAGG 


ATGACCACTT 


TCTAGTCTTG 


17820 


AATAAACCCT 


ATGGAGTGGC 


TTCTATTCCT AGTGTCAATC 


ACTCTAATAC 


CATTGCCAAT 


17880 


TTTATCAAGG 


GTTACTATGT 


CAAGCAAAAT TATGAAAATC 


AGCAGGTTCA 


CATTGTTACC 


17940 


AGACTAGATA 


GGGATACTTC 


TGGCTTGATG CTCTTTGCCA 


AGCACGGTTA 


TGCCCATGCA 


18000 


CGATTAGACA 


AGCAGTTGCA 


GAAGAAATCT ATCGAGAAAC 


GCTACTTTGC 


TTTGGTTAAG 


18060 


GGAGATGGAC 


ATTTGGAGCC 


AGAAGGGGAA ATTATTGCTC 


CGATTGCGCG 


TGATGAAGAT 


18120 


TCCATTATTA 


CCAGACGAGT 


GGCTAAAGGC GGAAAGTATG 


CCCATACTTC 


ATACAAGATT 


18180 


GTAGCTTCTT 


ATGGAAATAT 


TCACTTGGTC TATATTCACC 


TGCACACTGG 


TCGAACCCAT 


18240 


CAAATCCGAG 


TCCATTTTTC 


TCATATCGGT TTTCCTTTGC 


TGGGAGATGA 


TTTGTATGGT 


18300 


GGTAGTCTGG 


AAGATGGTAT 


TCAACGTCAG GCTCTGCATT 


GCCATTACCT 


ATCCTTTTAT 


18360 


CATCCATTTT 


TAGAGCAAGA CTTGCAGTTA GAAAGTCCCT 


TGCCGGATGA 


TTTTAGTAAC 


18420 


CTTATTACCC AGTTATCAAC TAATACTCTA TAAAAACTGT CTCAGAGTAT AATTATTATC 


18480


TTAAAGGAGA AAACTCATGG AAGTTTTTGA AAGTCTCAAA GCCAACCTTG TTGGTAAAAA 


18540 


TGCTCGTATC 


GTTCTCCCTG 


AAGGGGAAGA GCCTCGTATT 


CTTCAAGCAA 


CAAAACGCTT 


18600 


AGTAAAAGAA 


ACAGAAGTGA 


TTCCTGTTTT GCTTGGAAAT 


CCTGAAAAAA 


TTAAAATTTA 


18660 


TCTTGAAATT GAAGGAATCA TGGATGGTTA TGAGGTCATC GACCCTCAAC ATTATCCTCA 


18720 


ATTTGAAGAA 


ATGGTTTCTG 


CCTTGGTGGA GCGTCGCAAG 


GGCAAAATGA 


CTGAAGAAGA 


18780 


TGTACGCAAG 


GTTTTGGTTG 


AAGATGTCAA CTACTTTGGT 


GTGATGTTGG 


TTTACTTGGG 


18840 


CTTGGTTGAT GGAATGGTGT CAGGAGCGAT TCACTCAACA GCTTCAACAG TTCGCCCAGC 


18900 


TCTACAAATC ATCAAAACTC GTCCAAATGT AACTCGTACT TCAGGAGCCT TCCTCATGGT 


18960 


TCGTGGTACG GAACGTTACC TATTTGGAGA CTGTGCCATT AACATCAATC CAGATGCAGA 


19020 


AGCCTTGGCT 


GAAATTGCCA 


TCAACTCAGC AATCACAGCT 


AAGATGTTTG 


GCATCGAACC 


19080 


TAAAATTGCC ATGTTGAGCT ATTCTACTAA AGGTTCAGGG TTTGGTGAAA GCGTTGATAA 


19140 


GGTCGTTGAA 


GCAACTAAAA 


TTGCTCACGA CTTGCGTCCT 


GACCTTGAAA 


TCGATGGTGA 


19200 


GTTGCAATTT 


GATGCAGCCT 


TTGTTCCTGA AACTGCAGCT 


CTGAAAGCTC 


CTGGAAGTAG 


19260 


GGTAGCTGGT 


CAAGCAAATG TCTTCATCTT CCCAGGTATC GAGGCAGGAA ATATTGGTTA 


19320 
CAAGATGGCT GAACGCCTGG GTGGCTTTGC GGCTGTAGGA CCTGTTTTGC AAGGTTTAAA 19380

CAAGCCAGTT AATGATCTTT CTCGTGGATG TAATGCAGAT GATGTTTACA AGTTGACCCT 19440 

CATCACAGCA GCTCAAGCAG TTCATCAATA GTGAAAACTA TAAAGTGATA TACTATGCTA 19500 

TACTGTAGTT ATGAAACTAT GTACGAAAAG CACTGCCATT AATTCCTGAG AACTAAATTA 19560 

CTGATTGGTG TCAAAAAGGA AAACTTCCAA GCGATGATAT CCTGTCTATA CACGACCTAT 19620 

AGAAATCTGT AATATACATA TCCGTAAAAC GATAAATTCC CTTTTTGATT TTAAATGAGT 19680 

ATGAAAAGAG AATTTTTTGG CTCTTTGTCA ACTGTAGTGG GTTGAAGAAA AGCTAAGCTC 19740 

GAGAAAGGAC AAATTTCATC CTTTCTTTTT TGATATTCAG AGCGATAAAA ATCCGTTTTT 19800 

TGAAGTTTTC AAAGTTCCGA AAACCAAAGG CATTGCGCTT GATAAGTTTG ATGAGATTAT 19860 

TGGTCGCTTC CAGTTTGGCG TTAGAATAGT GTAGTTGAAG GGCGTTGATA ATCTTTTCTT 19920 

TATCTTTGAG GAAGGTTTTA AAGACAGTCT GAAAAATAGG ATGAACCTGC TTAAGATTGT 19980 

CCTCAATAAG TCCGAAAAAT TTCTCTGGTT CCTTATTCTG GAAGTGAAAA AGCAAGAGTT 20040 

GATAGAGCTG ATAGTGGTGT TTCAAGTCTT CCGAATAGCT CAAAAGCTTG TTTAAAATCT 20100 

CTTTATTGGT TAAGTGCATA CGAAAAATAG GACGATAAAA TCGCTTATCA CTCAGTTTAC 20160 

GGCTATCCTG TTGAATGAGT TTCCAGTAGC GCTTGATAG 20199 
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19702 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

ACCCGATGTA TCAGCGGATA TTTACTCTAT TTTTCAAACG ATGTTATACC CACAATAAAA 60 

GAAAAAAGAC CCTAAGGTCT CCTTTGCTTT TATTATTAAA CGCGTTCAAC TTTACCTGAT 120 

TTCAAAGCAC GAGCTGAAGC CCAAACTTTT TTAGGTTTAC CATCGATAAG AACAGTAACT 180 

TTTTGAAGGT TTGGTTTTAC GGCACGTTTT GTTTGGTTCA TCGCGTGTGA ACGGTTGTTT 240 

CCTGATACAG TCTTACGACC TGTAAAGTAA CATACTTTAG CCATTGTGTT TTCCTCCTAT 300 

TAGATCTAAT ATAGCGGATG TGCTAGCACC ACATACCGTA CTATGTTATC ACATTTTCTT 360 

GTTTTTTGCA AGGGAATTGG AAGATTTTTT ATTTGTGTCT TAAATCAGGT CTTGCGTGAC 420 

ATTTCTGCTC TCCACATGCC ATCGTTGATT AACAGAACAC CAGAATTAAA ATTATGTGTA 480

TAAAAATCAT CTCTAACTGC AGCTAAGGGT ATAGCCGTCA AGTCCAAATC CCACAGCTCA 540 






TCTATCGATT TTCTTACAAC AATATCTGAA TCCAAATACA GTACACGAGA CTCGCTTACA 


600 


TACTTTGGAA 


TAAAATACCT 


AAAAAAGCCG 


CATATGAAAG 


TCCCTCAAAG 


GGGAGACGAT 


660 


AACCTTTCAG 


AATATTACTG 


TCAATCTAAA CATTCACAAT CTCACTATTC AAAGTCTCTA 


720 


GTCTTTTTTC CATCAATTGG AACCATTCTC 


GCGGAAGGTC 


ATCATTAAAA 


ACATAAAACT 


780 


TAAGATTATA ATGATGAACA CAAAGAGATT 


TTATTGTTGT 


TTCAACTTTA 


TCCATATAAG 


840 


GATTATCTGC 


ACCTAAGACA 


ATCGCTTTTT 


TCTCTTCTTT 


CACTTTTTAT 


CTCATTTCTT 


900 


TTTATTCCCA TCATATTATT CCCATCATAT 


GTTTCCCATC 


ATATGTTTCT 


ACGTAACCAT 


960 


TATTTTCGCC 


TATTCGTTCG 


TAAAACCATA 


CCAGTGGAGA 


TTTTAGATGA 


AGTCCCATTA 


1020 


CGGTTTACAA 


TTTTTACATT 


ACGACACGGA 


GTTTTACAAA 


TCGATTTGAT 


TTGCCAAACG 


1080 


TAGTTAGTGA 


GGGAGTTAGC 


TAGTTCGCCA 


AATAGGGACT 


AGCGTCCAAC 


AATTTGGAAC 


1140 


TTTAGTTCCA 


ATTGTTGGTA 


CTGAGTCACA 


TCTTCTCCTC 


TAACTCTACG 


TCTGGATACT 


1200 


TGTCCGCAAA 


CCAGCGGAGG 


GCAAAGTCAT 


TTTCAAAGAG 


AAAGACTGGT 


TGGTCAAAAC 


1260 


GGTCTTTGGC 


TAAGATATTG 


CGACTTGACG 


ACATCCGTTC 


ATCCAAGTCC 


TCAGGCTTGA 


1320 


TCCAACGAAC 


GGTCTTTTTA 


CCCATTGGGT 


TCATAACTAC 


TTCCGCATTG 


TACTCGCCTT 


1380 


CCATGCGGTG 


TTTAAAGACT 


TCAAACTGGA 


GTTGACCTAC 


AGGGCCTAGC 


ATGTACTCAC 


1440 


CTGTTTGGTA 


ATTCTTATAA 


AGCTGAAGGG 


CTCCTTCTTG 


CACCAATTGG 


TCAATCCCCT 


1500 


TGTGGAAGGA 


TTTTTGCTTC 


ATAACATTCT 


TAGCAGAAAC 


TTTCATGAAA 


ATCTCAGGTG 


1560 


TAAAGGTTGG 


CAGGGGTTCA 


AATTCAAACT 




AACCGTCAAG 


GTATCCCCAA 


1620 


CCTGATAAGT 


ACCGGTATCG 


TAAACCCCGA 


TAATATCACC 


TGCCACGGCA 


TTGGTCACAT 


1680 


TCTCACGACT 


CTCCGCCATA 


AACTGGGTAA 


CATTAGATAG 


TTTAGCCCCC 


TTACCAGTAC 


1740 


GAGGGAGATT 


GACACTCATG 


CCGCGCTCAA 


ATTCGCCAGA 


TACGATACGG 


ACAAAGGCAA 


1800 


TACGGTCACG 


GTGACGAGGG 


TCCATGTTGG 


CTTGGATTTT 


AAAGACAAAG 


CCTGAGAAAT


1860 


CCTTGTCATA 


AGGATCCACA 


ATTTCACCGT 


CTGTTTTCTT 


GTGACCATGT 


GGTTCTGGAG 


1920 


CAAACTTGAG 


GAAGGTTTCA 


AGGAAGGTCT 


GCACACCAAA 


GTTTGTCAGG 


GCTGAACCGA 


1980 


AAAAGACAGG 


CGTCAATTCT 


CCAGCCAGAA 


TAGCTTCCTC 


TGAAAACTCA 


TTCCCGGCTT 


2040 


CATTTAAAAG 


CTCAATGTCA 


TCCTTGACTT 


GCTCGTAGAA 


AGGATTGCTA 


CCAAAGAGTT 


2100 


TGTCCCCGTC TTCTAGACTG 


GCAAAACGCT 


CATCCCCTTT 


GTAAAGCTCT 


AAACGTTGGT 


2160 


TATAGAGGTC 


ATACAAGCCC 


TCAAAGGCTT 


TCCCCATCGC 


GATAGGCCAG 


TTCATAGGGT 


2220 


AGCTAGCAAT 


GCCCAAGATT 


TCTTCCAATT 


CTTGCAAGAG 


ATCCAAAGGC 


TCAGGAGCGT 


2280 
CACGGTCCAG CTTGTTCATA AAGGTAAAGA CTGGAATGCC ACGATGTTTC ACAACCTCAA 2340

ACAATTTCTT GGTTTGAGCC TCGATCCCCT TGGCAGAGTC CACGACCATG ACCGCAGCAT 2400 

CCACCGCCAT CAAGGTACGA TAGGTATCTT CTGAGAAGTC CTCGTGCCCT GGCGTGTCTA 2460 

AGATATTCAC GCGCTTGCCG TCGTAGTCAA ATTGCATAAC AGATGAAGTA ACAGAAATCC 2520 

CACGTTGCTT CTCGATATCC ATCCAGTCAG ATTTAGCAAA AGTCCCTGTT TTCTTCCCTT 2580 

TTACCGTACC AGCCTCACGA ATCTCACCCC CAAAGTAGAG TAACTGCTCA GTGATGGTTG 2640 

TTTTCCCCGC GTCCGGGTGG GAGATAATGG CAAAGGTACG ACGTTTCTTA ATTTCTTCTT 2700 

GAATATTCAT AAGTTCTCTT TCTTTGATTC TCTATTTTTC TTGTTTCAAT AGCTGAGAAT 2760 

GATTTTTACA TTGGATTTTA CCATTCCTTT CAACACTCCA TTATATCGGA TTTTAGCATT 2820 

TTTTTCAATT TCTATTTCTT TTCACTTCCC CCTCGCTTAT TTATAGGAAA ATATGGTAAA 2880 

ATAGAACAGA CTAAAAATCA TCATTTCACG AAAGGATGCA AGATGAAAAT TACGCAAGAA 2940 

GAGGTAACAC ACGTTGCCAA TCTTTCAAAA TTAAGATTCT CTGAAGAAGA AACTGCTGCC 3000 

TTTGCGACCA CCTTGTCTAA GATTGTTGAC ATGGTTGAAT TGCTGGGCGA AGTTGACACA 3060 

ACTGGTGTCG CACCTACTAC GACTATGGCT GACCGCAAGA CTGTACTCCG CCCTGATGTG 3120 

GCCGAAGAAG GAATAGACCG TGATCGCTTG TTTAAAAACG TACCTGAAAA AGACAACTAC 3180 

TATATCAAGG TGCCAGCTAT CCTAGACAAT GGAGGAGATG CCTAATGACT TTTAACAATA 3240 

AAACTATTGA AGAGTTGCAC AATCTCCTTG TCTCTAAGGA AATTTCTGCA ACAGAATTGA 3300 

CCCAAGCAAC ACTTGAAAAT ATCAAGTCTC GTGAGGAAGC CCTCAATTCA TTTGTCACCA 3360 

TCGCTGAGGA GCAAGCTCTT GTTCAAGCTA AAGCCATTGA TGAAGCTGGA ATTGATGCTG 3420 

ACAATGTCCT TTCAGGAATT CCACTTGCTG TTAAGGATAA CATCTCTACA GACGGTATTC 3480 

TCACAACTGC TGCCTCAAAA ATGCTCTACA ACTATGAGCC AATCTTTGAT GCGACAGCTG 3540 

TTGCCAATGC AAAAACCAAG GGCATGATTG TCGTTGGAAA GACCAACATG GACGAATTTG 3600 

CTATGGGTGG TTCAGGTGAA ACTTCACACT ACGGAGCAAC TAAAAACGCT TGGAACCACA 3660 

GCAAGGTTCC TGGTGGGTCA TCAAGTGGTT CTGCCGCAGC TGTAGCCTCA GGACAAGTTC 3720 

GCTTGTCACT TGGTTCTGAT ACTGGTGGTT CCATCCGCCA ACCTGCTGCC TTCAACGGAA 3780 

TCGTTGGTCT CAAACCAACC TACGGAACAG TTTCACGTTT CGGTCTCATT GCCTTTGGTA 3840 

GCTCATTAGA CCAGATTGGA CCTTTTGCTC CTACTGTTAA GGAAAATGCC CTCTTGCTCA 3900 

ACGCTATTGC CAGCGAAGAT GCTAAAGACT CTACTTCTGC TCCTGTCCGC ATCGCCGACT 3960 

TTACTTCAAA AATCGGCCAA GACATCAAGG GTATGAAAAT CGCTTTGCCT AAGGAATACC 4020 

TAGGCGAAGG AATTGATCCA GAGGTTAAGG AAACAATCTT AAACGCGGCC AAACACTTTG 4080 
AAAAATTGGG


TGCTATCGTC 


GAAGAAGTCA 


GCCTTCCTCA 


CTCTAAATAC 


GGTGTTGCCG 


4140 


TTTATTACAT 


CATCGCTTCA 


TCAGAAGCTT 


CATCAAACTT 


GCAACGCTTC 


GACGGTATCC 


4200 


GTTACGGCTA 


TCGCGCAGAA 


GATGCAACCA 


ACCTTGATGA 


AATCTATGTA 


AACAGCCGAA 


4260 


GCCAAGGTTT 


TGGTGAAGAG 


GTAAAACGTC 


GTATCATGCT 


GGGTACTTTC 


AGTCTTTCAT 


4320 


CAGGTTACTA 


TGATGCCTAC 


TACAAAAAGG 


CTGGTCAAGT 


CCGTACCCTC 


ATCATTCAAG 


4380 


ATTTCGAAAA 


AGTCTTCGCG 


GATTACGATT 


TGATTTTGGG 


TCCAACTGCT 


CCAAGTGTTG 


4440 


CCTATGACTT 


GGATTCTCTC AACCATGACC CAGTTGCCAT GTACTTAGCG GACCTATTGA 


4500 


CCATACCTGT 


AAACTTGGCA 


GGACTGCCTG 


GAATTTCGAT 


TCCTGCTGGA 


TTCTCTCAAG 


4560 


GTCTACCTGT 


CGGACTGCAA 


TTGATTGGTC 


CCAAGTACTC 


TGAGGAAACC 


ATTTACCAAG 


4620 


CTGCTGCTGC 


TTTTGAAGCA 


ACAACAGACT 


ACCACAAACA 


ACAACCCGTG 


ATTTTTGGAG 


4680 


GTGACAACTA 


ATGAACTTTG 


AAACAGTCAT 


CGGACTTGAA 


GTCCACGTAG 


AGCTCAACAC 


4740 


CAATTCAAAA 


ATCTTCTCAC 


CTACTTCTGC 


CCACTTTGGA 


AATGACCAAA 


ATGCCAACAC 


4800 


TAACGTGATT 


GACTGGTCTT 


TCCCAGGAGT 


TCTACCAGTT 


CTCAATAAAG 


GGGTTGTTGA 


4860 


TGCCGGTATC 


AAGGCTGCTC 


TTGCCCTCAA 


CATGGACATC 


CACAAAAAGA 


TGCACTTTGA 


4920 


CCGCAAGAAC 


TACTTCTATC 


CTGATAACCC 


CAAAGCCTAC 


CAAATTTCTC 


AGTTTGATGA 


4980 


ACCAATCGGA 


TATAATGGCT 


GGATTGAAGT 


CAAACTAGAA 


GACGGTACGA 


CCAAGAAAAT 


5040 


CGGTATCGAA 


CGTGCCCACC 


TAGAGGAAGA 


CGCTGGTAAA 


AACACCCATG 


GTACAGATGG 


5100 


CTACTCTTAT 


GTTGACCTCA 


ACCGCCAAGG 


GGTTCCCTTG 


ATTGAGATTG 


TATCTGAGGC 


5160 


AGATATGCGT 


TCTCCTGAAG 


AAGCCTATGC 


TTATCTGACA 


GCCCTCAAGG 


AAGTTATCCA 


5220 


GTACGCTGGC 


ATTTCTGACG 


TTAAGATGGA 


GGAAGGTTCG 


ATGCGTGTGG 


ATGCCAACAT 


5280 


CTCCCTTCGT 


CCTTATGGTC 


AAGAGAAATT 


CGGTACCAAG 


ACTGAATTGA 


AGAACCTCAA 


5340 


CTCCTTCTGA 


AACGTTCGTA 


AAGGTCTTGA 


ATACGAAGTC 


CAACGCCAGG 


CTGAAATTCT 


5400 


TCGCTCAGGT GGTCAAATCC GCCAAGAAAC ACGCCGTTAC GATGAAGCGA ATAAAGCAAC 


5460 


CATCCTCATG 


CGTGTCAAGG 


AAGGGGCTGC 


TGACTACCGC 


TACTTCCCAG 


AACCAGACCT 


5520 


ACCCCTCTTT 


GAAATTTCTG 


ACGAGTGGAT 


TGAGGAAATG 


CGGACTGAGT 


TGCCAGAGTT 


5580 


TCCAAAAGAA 


CGTCGTGCGC 


GTTATGTATC 


TGACCTTGGT 


TTATCAGACT 


ACGATGCTAG 


5640 


TCAGTTGACT GCTAATAAAG TCACTTCTGA CTTCTTTGAA AAAGCTGTTG CCCTAGGTGG 


5700 


TGATGCCAAA 


CAAGTCTCTA 


AGTGGCTCCA 


AGGGGAAGTC 


GCTCAGTTCT 


TGAATGCTGA 


5760


AGGTAAAACA 


CTGGAACAAA 


TCGAATTGAC 


ACCAGAAAAC 


TTGGTTGAAA 


TGATTGCCAT 


5820 
CATCGAAGAC GGTACTATTT CATCTAAGAT TGCCAAGAAA GTCTTTGTCC ATCTAGCTAA 5880

AAATGGCGGT GGCGCGCGTG AATACGTGGA AAAAGCAGGT ATGGTTCAAA TTTCAGATCC 5940 

AGCTATCTTG ATCCCAATCA TCCACCAAGT CTTTGCCGAT AACGAAGCTG CTGTTGCCGA 6000 

CTTCAAGTCA GGCAAACGTA ACGCCGACAA GGCTTTACAG GATTCCTTAT GAAGGCAACC 6060

AAAGGCCAAG CCAACCCACA AGTTGCCCTT AAACTACTTG CACAGGAATT GGCGAAGTTG 6120 

AAAGAAAACT AGACAGAACA AAACCAGCCC TAAGGTTGGT TTTTTCTTCT CTACCAACTC 6180 

CCAATAACTA TTTTGGCTTT ATTTCCAGAG TATTTTATGG TAAAATGAAG AGTAATAATA 6240 

TTTATTAAAG AGGTAAAAAC ATGATTGAAG CAAGTACCTT AAAAGCTGGT ATGACCTTTG 6300 

AAACAGCTGA CGGCAAATTG ATTCGCGTTT TGGAAGCTAG TCACCACAAA CCAGGTAAAG 6360 

GAAACACGAT CATGCGTATG AAATTGCGTG ATGTCCGTAC TGGTTCTACA TTTGACACAA 6420 

GCTACCGTCC AGAGGAAAAA TTTGAACAAG CTATTATCGA GACTGTCCCA GCTCAATACT 6480 

TGTACAAAAT GGATGACACA GCATACTTCA TGAATACAGA AACTTATGAC CAATACGAAA 6540 

TCCCTGTAGT CAATGTTGAA AACGAATTGC TTTACATCCT TGAAAACTCT GATGTGAAAA 6600 

TCCAATTCTA CGGAACTGAA GTGATCGGTG TCACCGTTCC TACTACTGTT GAGTTGACAG 6660 

TTGCTGAAAC TCAACCATCT ATCAAAGGTG CTACTGTTAC AGGTTCTGGT AAACCAGCAA 6720 

CGATGGAAAC TGGACTTGTC GTAAACGTTC CAGACTTCAT CGAAGCAGGA CAAAAACTCG 6780 

TTATCAACAC TGCAGAAGGA ACTTACGTTT CTCGTGCCTA ATCTCTAGAA AGAGGTCATT 6840 

CTATGGGAAT TGAAGAACAA CTTGGCGAAA TCGTTATCGC CCCACGTGTA CTTGAAAAAA 6900 

TCATTGCTAT CGCTACTGCA AAGGTAGAGG GTGTTCACTC TTTTTCAAAC AGATCAGTGT 6960 

CTGATACCCT TTCAAAACTT TCACTCGGCC GTGGCATTTA TCTTAAAAAC GTGGACGAAG 7020 

AACTCACAGC AGATATCTAT CTCTACCTTG AGTACGGAGT AAAAGTTCCT AAGGTAGCGG 7080 

TTGCTATCCA GAAAGCTGTC AAAGATGCCG TCCGTAATAT GGCTGATGTA GAACTCGCTG 7140 

CTATCAATAT TCACGTTGCA GGTATCGTCC CAGATAAAAC ACCAAAACCA GAATTGAAAG 7200 

ATCTATTTGA CGAGGACTTC CTCAATGACT AGTCCACTAT TAGAATCTAG ACGCCAACTC 7260 

CGTAAATGCG CTTTTCAAGC TCTCATGAGC CTTGAGTTCG GTACGGATGT CGAAACTGCT 7320 

TGTCGTTTCG CCTATACTCA TGATCGTGAA GATACGGATG TACAACTTCC AGCCTTTTTG 7380 

ATAGACCTCG TTTCTGGTGT TCAAGCTAAA AAGGAAGAAC TAGATAAGCA AATCACTCAG 7440 

CATTTAAAAG CAGGTTGGAC CATTGAACGC TTAACGCTCG TGGAGAGAAA CCTCCTTCGC 7500 

TTGGGAGTCT TTGAAATCAC TTCATTTGAC ACTCCTCAGC TGGTTGCTGT TAATGAAGCT 7560 

ATCGAGCTTG CAAAGGACTT CTCCGATCAA AAATCTGCCC GTTTTATCAA TGGACTGCTC 7620 
AGCCAGTTTG


TAACAGAAGA 


ACAATAAGGC 


TCTTTGTCAA 


CTGTAGTGGG 


TTGAAAAAAA 


7680 


GCTAAGCTCG 


AGAAAGGACA 


AATTTCGTCC 


TTTCTTTTTT 


GATGTTCAAA 


GCGATAAAAA 


7740 


TCCGTTTTTT 


GAAGTTTTCA AAGTTTCGAA AACCAAAGGC 


ATTGCGCTTG 


ATAAGTTTGA 


7800 


TGAGATTATT 


GGTCGCTTCC 


AGTTTGGCAT 


TAGAATAGTG 


TAGTTGAAGG 


GCGTTGACAA 


7860 


TCTTTTCTTT ATCTTTGAGG AAGGTTTTAA AGACAGTCTG AAAAATAGGA TGAGCCTGCT 


7920 


TAAGATTGTC 


CTCAATAAGT 


CCGAAAAATT 


TCTCTGGTTC 


CTTATTCTGG 


AAGTGAAACA 


7980 


GCAAGAGCTG 


ATAGAGCTGA 


TAGTGGTGTT 


TCAAGTCTTG 


TGAATGGCTC 


AAAAGCTTGT 


8040 


CTAAAATCTC 


TTTATTGGTT 


AAGTGCATAC 


GAAAAGTAGG 


ACGATAAAAT 


CGCTTATCAC 


8100 


TCAGTCTACG 


GCTATCCTGT 


TGAATGAGTT 


TCCAGTAGCG 


CTTGATATCC 


TTGTATTCAT 


8160 


GGGATTTTCG 


ATGAAACTGA 


TTCATGATTT 


GGACACGCAC 


ACGACTCATG 


GCACGGCTAA 


8220 


GATGTTGTAC 


AATGTGAAAG 


CGATCAAGAA 


CGATTTTAGC 


ATTCGGGAGT 


GAAACAGTCT 


8280 


GGGAGACTGT 


TTCAGCCTGA 


GCCTAGGAAT 


TTGAAAGCGA 


AGCTGTTTAG 


CCAAGTCATA 


8340 


GTAAGGGCTA 


AACATATCCA 


TAGTAATAAT 


TTTGACGCGA 


CATCGGACAA 


CTCTATCGTA 


8400 


GCGAAGAAAG 


TGATTTCGAA 


TGATAGCTTG 


TGTTCTACCC 


TCAAGAACAG 


TGATGATATT 


8460 


GAGATTGTTA 


AAATCTTGCG 


CAATGAAGCT 


CATCTTTCCC 


TTTGTAAAAG 


CATACTCATC 


8520 


CCAAGACATA 


ATCTCAGGAA 


GACAAGAAAA 


ATCATGTTTA 


AAGTGAAAAT 


CATTGAGCTT 


8580 


ACGAATAACA 


GTTGAAGTTG 


AGATGGAAAG 


CTGATGGGCA 


ATATCAGTCA 


TAGAAATCTT 


8640 


TTCAATCAAC 


TTTTGAGCAA 


TCTTTTGGTT 


GATGATACGA 


GGGATTTGGT 


GATTTTTCTT 


8700 


GACGATAGAA 


GTTTCAGCGA 


CCATCATTTT 


TGAACAGTGA 


TAGCACTTGA 


ATCGACGCTT 


8760 


TCTAAGGAGA 


ATTCTAGTAG 


GCATACCAGT 


CGTTTCAAGA 


TAAGGAATTT 


TAGAAGGTTT 


8820 


TTGAAAGTCA 


TATTTCTTCA 


ATTGGTTTCC 


GCACTCAGGG 


CAAGATGGGG 


CGTCGTAGTC 


8880 


CAGTTTGGCG 


ATGATTTCCT 


TGTGTGTATC 


CTTATTGATG 


ATGTCTAAAA 


TCTGGATATT 


8940 


AGGGTCTTTA 


ATGTCTAGTA 


ATTTTGTGAT 


AAAATGTAAT 


TGTTCCATAT 


GAATCTTTCT 


9000 


AATGAGTTGT 


TTTGTCGCTT 


TTCATTATAG 


GTCATATGGG 


ACTTTTTTTC 


TACAATAAAA 


9060 


TAGGCTCCAT 


AATATCTATA 


GGGGATTTAC 


CCACTACAAA 


TATTATAGAG 


CCAACAATAA 


9120 


AAAGAAAAAG 


TGTTTGATAG 


ATATCAAACA 


CTTTTTTCTT 


TGCCTCCCAC 


TATCTAAAAA 


9180 


AATGATAATA 


GATATAATTG 


TAAACAAAAA 


TCCAGATAGG 


TTTTGCATGA 


TTGAGAAAGT 


9240 


TAAAAAAACT 


ATGGCAGAGA 


ATCGTTAATC 


TCAGATTGTC 


GGTAGAACGA 


TAAACAAGGG 


9300 


CAAAAAAGAA 


ACCAATCAGA 


CTATAATATA 


ATAAACTAAT 


TGGATCTCTG 


TGAGATAGTA 


9360 
TCAAATGGCT AATCCCAAAG ATGATAGCAG ATAGGATAAC ATCCAAATAG TACTTGGACT 9420

AGGGAAAGAA GGTATTCATA AAATACCCTC TATCAAGAGT CTCCTCAAAA ACAGGACCGA 9480 

TGATTACAGG CAGGACAAAA GATAAGATAG TCGATAAAAA GGTTGGTTGT CCATTTGAAA 9540 

AAAGCACGGT AAAATACTCA TCATGAATAT TCCTATGATT AATCAAATGA GCATAGCGTG 9600 

CCCAAAAATT ACCGAGAATC TGATAAACCA CATAAGTTGC AAATAAGTAG AAGACAAATG 9660 

ACCAGTTCCA GCTCTTTTTC TCAAAGATAA AGAGCATCTT TTTCTTTTTT AACCTCCAAA 9720 

TTAATAGAAG GAAACTTCCC ACTAATCCCA TTGTTAAAAT AAGAGAATAG ACATCAGCTC 9780 

CTAACCCTAA AATGATCGTC ACATACAATC CAATTGTTTG TGGTAAATAG GTAGATAGTA 9840 

AAATAATAAG CAAAAATATT CCAAATTGTC TTAGTTTTTT TGTGTTTCTC ATCGTACTTT 9900 

TTTGAAAGAT TACCCTGCTC GGAAGCCGTA CTTCCAAGCA TCTATATAAG AATTAAGTGC 9960 

CCCTTGCCTC ATATAGGGAG CAAATTCTCT ATAATATAAC CATCTACTAT ATCCATCTTC 10020 

CCAAACAGCA AGACCACCTG AAGTTTGCTC CAAGTCCTCA GTTGAAAGAA CTGTAAATGT 10080 

ATTTGTACCT GTCATTGCAA GTACCTTCTT AAAATAGATT GTTGTAGGCT CACATTTATA 10140 

GTATATTTCT TTTTTTGTCT ATTTTATAGC CCATCTCCTC AACTGGCAAT TTTTCGACCT 10200 

GAATTACATT TTTCCATAAA AAATGAGACC TTTCTAGTCT CATTTAGTCA TTCTTAGTAT 10260 

TTTCTAAATC GTTGATAGCG TTCTTCCAGC AACTCTTCTA GCGGTTTTTG TGAAAGTCTA 10320 

GCCAGCTCCG TTTGGAGTTC TTTTTTGACA CTCTTAATCA GTTCTTTACT AGAAAGTCCT 10380 

ATTTCAGAAA TCACCTTATC CACCACGTCC ATTTCTAACA GTTCATGCGA AGTGATTTTC 10440 

ATCAGTTCTG CTGCTTCCAT AGCGCGAGTA CCGTCCTTCC ATAAAATGGA AGCAAAGCCT 10500 

TCTGGACTGA GAATGGCATA GATAGAATTT TCCAGCATCC AGACACGGTC CGCGACAGCT 10560 

AGAGCCAGAG CCCCGCCTGA ACCACCTTCA CCGATAATAA TGGCGATAAT AGGAACTTTC 10620 

AGGTCACTCA TTTCCATGAG ATTGCGAGCG ATAGCTTCCC CTTGACCACG TTCTTCCGCT 10680 

CCGACACCAG GATAAGCACC TGCTGTATTG ATAAAGGTCA CAACTGGACG GCCAAATTTC 10740 

TCAGCCTGTT TCATCAACCG CAGTGCCTTT CGGTAGCCTT CTGGATGTGG TTGGCCAAAA 10800 

TTCCGTTTGA GGTTGTCTTG CAAACTCTTG CCTTTTTGGA TACCAACCAC TGTTACAGCT 10860 

TGGTCTCCAA GCCAACCAAT ACCACCAACA ACTGCACCAT CATCACGAAA AGAACGGTCA 10920 

CCATGTAATT GGATAAATTC ATCAAAAATG CCTGTCGCAA AGTCCAAGGT TGTCAAGCGA 10980 

CTCTGCTCAC GCGCTTCTCT GACTATTTTT GCAATATTCA TCTAGGACTC CCTCCATGCA 11040 

ATCTGACTAG GCTAGCAATC GTATCTGGTA AGTCTCTTCT TTTGACAATA GCATCCACAA 11100 

AGCCATGTTC TAATAGGAAT TCTGCCTTTT GGAAATCCTC AGGCAAGCTT TCACGAACCG 11160 
TATTTTCAAT CACACGACGC CCAGCAAAAC CAACCAAGCT CTGTGGTTCA GCCAGAATGA 11220

TATCGCCTTC CATAGCGAAA GAAGCTGTCA CACCACCAGT CGTTGGATCT GTCAAAATGG 11280 

TCAGGTAAAA GAGACCAGCA TTTGAATGGC GTTTAACCGC CGCAGAGATC TTAGCCATCT 11340 

GCATGAGACT CATGATTCCT TCCTGCATAC GGGCTCCACC AGAGGCTGTG AATAGGACAA 11400 

CTGGCAATTT TTCGACAGTC GCATACTCAA ACAAACGAGT GATTTTTTCA CCTACAACCG 11460 

TACCCATAGA AGCCATGATA AAGTTAGAAT CCATAATCCG AAGAGCCACA GTCTGACCTT 11520 

TAATAAGAGC AGTTCCTGTC ACAACGGCTT CATGCAGACC TCTTTTTTCA CGCATAGATG 11580 

CCAGTTTCTT TTGGTAACCA GGGAAATGGA AGGGATCCTT GCTTTCAATC CCTGTAAACA 11640 

ATTCTTTGAA GGTTCCCATA TCAATCGTCA AAGCCAAGCG TTCTTGGGCA GAAATACGAA 11700 

AGGTATAGCT ACAGTGCGGA CAGATACGTT CACTTCCCAG ATCCTTCTGA TAGATGGTAT 11760 

GCTTACAGCC TGGACACTGG GAAAATAATT CATCTGGAAC CTCTGGCTTA GCTTGAGGTT 11820 

TTTCCCTAAC CGAACGATTG GGATTGATTC GAATATACTT ATCTTTTTTA CTAAATAGAG 11880 

CCATTGATTC CCCTTTTCGG TTTAAACTCT TAAAGTCATT TTATTCTTTT TCTTGATATT 11940 

TAGGTAAGAA GGTTTCCATC AAGAAGGAAG TATCATAATC CCCAGCAATG ACATTGCGAT 12000 

CTGAAATGAG GTCAAGCTGG AAATCTGCAT TGGTCTGCAC TCCTTCAATT TCTAATTCAT 12060 

AGAGGGCACG TTGCATTTTC ATCAAGGCGT CAAAACGATT TTCGCCGTGT ACTATGATTT 12120 

TGGCAATCAT ACTATCATAA TAAGGCGGAA TGGTATAACC TGGATAAACT GCTGAATCCA 12180 

CGCGCAAGCC AACTCCACCA CTTGGCAGAT AGAGATTAGT AATCTTACCT GGACTTGGAG 12240 

CAAAGTTAAA GGCTGGGTTT TCTGCATTGA TACGACACTC GATGGCATGA CCGCGTAGGA 12300 

CAATATCTTC TTGCTTAACA GACAAAGGCT GACCTGCCGC AATGCAAATC TGTTCCTTAA 12360 

CGATATCAAC ACCTGAAACA AACTCTGTTA CTGGATGTTC TACCTGAACA CGAGTATTCA 12420 

TCTCCATGAA ATAGAAATTG CTACTTGCTT CATCAAGAAG AAATTCAATG GTTCCTGCAT 12480 

TCTCATAGCC AACAAACTCT GCCGCTCGAA CAGCAGCAGC ACCTATTTCA TGACGCAGCG 12540 

TTTTTCCGAT TGCAATCGAG GGACTTTCTT CCAAAACCTT TTGGTTATTC CTTTGAAGAG 12600 

AACAATCCCG TTCACCCAAG TGAATCACAT GTCCATGCTC ATCACCTAGG ATTTGAACCT 12660 

CAATGTGCCG AGCTGGATAG ATAACCCGTT CTATGTACAT GGCACCATTG CCATAATTGG 12720 

CCTTGGCCTC ACTAGAGGCA GTTTCAAAGG CAGAAACGAG GTCATCTGGT TTTTCAACCT 12780 

TACGAATCCC TTTACCACCT CCACCTGCTG AAGCCTTGAG CATAACAGGA TAGCCAATTT 12840 

TTTCAGCAAC AATCAAAGCT TCTTCAGAGT TATGCACTTC TCCATCTGAA CCTGGTATAA 12900 
CAGGCACACC TGCTTTAATC ATCTGAGCAC GCGCATTGAT CTTATCCCCC ATCATATCCA 12960

TAACATGACC AGATGGACCG ATAAACTTGA TACCTACTTC TTCACACATG GTCGCAAATT 13020 

TGGAATTTTC ACTGAGAAAT CCAAAACCAG GGTGAATAGC TTCTGCCTCA GTCAAGACTG 13080 

CAGCTGATAG AACTGCATTA ATATTGAGAT AAGACTCTGT TGCCTTGCCA GGACCAATAC 13140 

AAACTGCTTC ATCTGCCAAA AGCGTATGAA GAGCTTCCTT ATCAGCAGTT GAATAAACCG 13200

CTACCGTCGC AATCCCCAAT TCACGTGCCG CACGGATAAT ACGAACCGCA ATTTCACCAC 13260 

GATTGGCAAT TAAAATTTTT CGAAACATGG AGAACCTCCT TAGTTCCCAA TTGCAAAAGT 13320 

AAGGGTACCA CTGGCTGCAA GCTTGCCATC CACTTCAGCC TTTGCTTCAA CCACAGCTAT 13380 

GGTGCCACGA CGTTTTACAA AAGTCGCTGT CATAACCAAT TGGTCGCCTG GTACAACTTG 13440 

CTTCTTGAAC TTAACCTTGT CCATACCAGC GTAAAAGACC AGTTTTCCTT TATTTTCAGG 13500 

TTTTGATAAC TCCAACACAC CGGCAGTTTG CGCCAAGGCT TCGATAATCA CAACACCTGG 13560 

CATAACTGGG TATTGAGGAA AGTGGCCGTT AAAGAAAGGC TCGTTGATGG TCACATTTTT 13620 

GATAGCAACA ATGGTATCCT CGCTCACTTC CAAGACACGG TCCACTAGAA GCATAGGATA 13680 

ACGGTGGGGA AGAGCTTCTT TGATTCCTTG AATATCGATC ATTTGATACG TACCAATCCT 13740 

TTACCAAACT CAACCATTTC TTCGTTAGAG ACGAGAATTT CCGTTACCAC ACCATCCTTA 13800 

GGAGCTGGGA TTTCATTCAT GACTTTCATG GCTTCGATAA TTACCAATGT TTGACCTTTT 13860 

TTGACACTAT CACCAACTGT AACGAAGGCA GGTTTATCTG GTCCAGCAGC CAAGTAAACC 13920 

ACTCCAACAA GTGGACTCTC TACAAGATTT CCCTCAGTAG CCACACTTGC TTCAGCTGGA 13980 

GCTGGAACTT CTTCTGCTAC AGTCTCTGCT GGAGCAGATG TAGGAGCTAC TGGACTCGGT 14040 

GTTGCTAGAA CGGGTGCTGG AGCGACTTGA GTTGCAACTT CAGGCACAGG TCTTGCTTCA 14100 

TTCTTGCTAA ACTGCAACTC ATCCGTCCCA TTTTTATAAG AAAATTCTCT CAAACTTGAC 14160 

TGGTCAAATT GAGTCATCAA GTCTTTAATA TCGTTTAAAT TCATACTTAT CTATTCTCCC 14220 

AACGTTTGAA AGCAAGAACT GCATTGTGGC CTCCAAAACC AAAAGTATTT GAAATAGCGT 14280 

ATGGAATTTC TTTCTCCAAG CCTTGTCCAT AAACGACATT AGCTTCGATA TAATCTGATA 14340 

CTTCACTTGT CCCAGCTGTC ATTGGTACAA AGTTATGACG CATAGCTTCG ATGGTGACGA 14400 

TAGCTTCTAC TGCACCCGCA GCCCCCAGCA AATGTCCTGT AAAAGACTTG GTTGATGATA 14460 

CAGGTACTTC CTTACCAAGA ACAGCTACGA TAGCACCACT TTCTCCTTTT TCATTGGCAG 14520 

GAGTTGACGT TCCGTGAGCA TTGACATAGG CTACTTGCTC TGGAGAAATC TCAGCTTCTT 14580 

CCAAGGCTAG TTTGATGGCC TTGATAGCTC CCTGACCTTC TGGATGTGGA GAAGTCATGT 14640 

GGTAGGCATC ACAAGTATTT CCGTAACCAA CCACTTCAGC CAGGATAGTA GCTCCACGTT 14700

TTTCAGCGTG TTCAAGACTT TCTAGAACCA ACATCCCTGA ACCTTCACCC ATAACAAACC 14760

CATTGCGATC CTTATCAAAT GGGATCGAAG CACGAGTTGG ATCCTCTGTA GTAGAGAGAG 14820

CTGTTAAGGC TTGGAAACCA GCGATGGCAA AAGGTGTGAT AGAAGCTTCT GTTCCTCCCA 14880

CCAACATCAC ATCTTGGAAA CCAAACTTAA TGGAGCGGAA GGCATGCCCA ATCGCATGAT 14940

TTGATGAAGA GCAGGCAGTA TTGATAGATT TACAAACACC GTTTGCACCA AAACGCATGG 15000

CTACATTCCC AGAAGCCATA TTTGGTAAAG CTTTTGGAAG AGTCATTGGT TTGACACGTT 15060

TGGGTCCTTT TTCATGAAGG CGAAGTACCT GATCTTCAAT TTCCTTGATT CCACCAATAC 15120

CAGATGCAAC GATAACACCA AAACGATCCC TATTAAGAGC CTCTACATCA AGATTGGCAT 15180

GATTTACAGC GTCTTGGGCT GCATACAAGG GATATAAAGA ATAGTTATGA AAACGGTTGG 15240

TATCTTTTTT TACAAAGTAT TTATCGAACG GAAAATCTTG GATTTCTGCC GCATTATGCA 15300

CATCAAAGTC ACTATGATCA AATTTTGTAA TGCCACCAAT GCCGATTTTC CCAGTTGCTA 15360

AACTATTCCA AAATTCTTCT GGTGTATTTC CGATTGGAGA TGTTACTGCA TAACCTGTTA 15420


CCACTACTCG ATTTAGTTTC ATTCTTTTCA CCTCTAGCTT TCGCTACATA CTTAAGCCAC 15480

CATCAATGGC AACCACTTGT CCAGTTAGAT AATCTTGGCC TGCTAAAAAT ACTGTCAAAT 15540

CTGCAACCTG CTCTGCCTGC CCAAATTCTT TCATCGGAAT CTGAGCTAGT GTAGCTTCCT 15600

TAATCTTATC TGACAGGATA GCGGTCATAT CAGACTCAAT CATTCCTGGA GCAATCACAT 15660

TGACTCGTAT ATTCCGACTA GCGACCTCGC GTGCCACAGA CTTGGTAAAG GCAATCAAGG 15720

CAGCCTTAGA AGCAGCATAA TTAGCTTGAC CAATATTCCC CATCAAACCA ACAACAGTAG 15780

ACATATTAAT GATAGCACCT TCTCTGGCTT TCATCATCGG TTTCAAGACT GATTGTGTCA 15840

TATTAAAGGC ACCAGTCAGA TTGACCTTGA GCACTTTTTC AAAATCTGGT TCTGTCATCT 15900

TGAGCATAAG AGTATCTTGG GTAATCCCTG CATTGTTGAC CAAAACATCT ACTGAACCCA 15960

GTTCTGCAAT AGCTTGATCA ATCATACGCT TAGCGTCTGG AAAATGTGAT ACATCTCGTG 16020

AAATGGGAAC CACCTTGATA CCATAGTTTG AAAACTCAGC GAGCAATTGT TCTGAGATTG 16080

CCCCACGACT GTTTAAGACA ATGTTGGCTC CTGCTTGAGC AAACTTGTGG GCGATGGCAA 16140

GACCAATTCC ACGACTCGAA CCTGTAATAA AGATATTTTT ATGTTCTAGT TTCATTTTTT 16200

TCCTTTCAAA ACTTCTACTT ATTTTAGTCT ATTTTTCTAA AAGTGCTACT AAACTCGCTT 16260

GATCTTCCAC ATGAGCTAAG TGAGCAGTTT GATCAATTTT TTTAACAAAA CCTGACAAGA 16320

CTTTCCCOGG TCCAATCTCG ATAAAGTTGC TTATGCGTGG TTGTTGCATG ACCCCAATAC 16380

TTTCATAGAA ACGAACGGGT TCGTTGACCT GACGCGTCAA GAGGTGAGCA ATGTGGTCTT 16440

TTTGCATCAC AGCAGCTTCT GTATTGCCGA CTAGGGGACA AGTAAAATCT GAAAAACTTA 16500

CCTGAGCTAG AGTTTCAGCT AGTTTCTGGC TAGCAGGTTC AAGGAGAGCG GTGTGAAAGG 16560


G AC CTGACAC CTTAAGAGGA ATCAAGCGTT fcxv&rrrpm 


TTCTTGCAAA 


AGTTCAACCG 


16620 


CTCGATCAAC TGCAACCACT TCTCCAGCAA TYSaPfiATTTV^ 


TGCAGGTGTG 


TTATAGTTGG 


16680 


CTGGAGTAAC CACTCCAAGT TCAGAAfifTT tt^tv* A r* a arm 


TTCTTCAATG 


ACCTCTACTG 


16740 


GCGTATTGAG AACTGCPAPC ATPT^Tfyrin ir^irnarr 


AGCCGCTTCT 


TCCATATAGG 


16800 




CAAGGCGCCA CTTGCCACCA 


16860 


AGGCAGAGTA TWrfCJ Af: A ^ , a^ , Ma^ , ^»a/ , r^npoknikfii^ 
" wjrv -"**"« A «* n\ltuu\\jh uALAAALunu UVACCATATC 


AGGCTGATAG 


CCCTTTTCTT 


16920 


\a 1 AWi 1 A^Wn At- LAj AAU I C U CTAGAATGGC 


TGGTTGCGTA 


TAGCGGGTCT 


16980 


vsrti ivMiv^i 1 1 u iT- 1 1a_ 1 b i A I uuATGA GATAACGCAA 


ATCATAACCG 


AGCACCTGGC 


17040 


x i i u /uv i liiLl I 1 AALAATCG GATACTGATC 


ATAGAAATCC 


CGTCCCATCC 


17100 


CTAGATACTG GGCACCTTGA CCAGCAAATA AAAAGGCTGT TTTAGTCATT TCTTACAACT 17160

CCTGTCCAGC GAGAGGCTTC TTCTTGAATT TTCTTAGCGG CTCCGTAATA CAAATCTTTT 17220

AGGATTTCTT CAGCTGTTTC TTCTTTAGAA ACAAGCCCTG CGATTTGACC TGCCATAACA 17280

GAGCCACCAT CCACATCACC GTGAACAACT GCTTTGGCTA GAGCACCTGC TCCCATTTGT 17340

TCAAAGATTT CTAAATCAGG ATCTTCTTGC TTAAAGGCAT CTTTTTCAGC CAGTTCAAAA 17400

TCTCTAGTCA ACTGATTTTT AATAGCACGA ACAGCATGAC CAAAGTGCTG AGCTGAAATC 17460

GTAGTATCAA TATCCCTTGC TTTTAAAATT TTCTCCTTGT AGTTTGGATG GGCATTCGAC 17520

TCTTTTGCAA CTACAAACCG TGTCCCCACC TGTACAGCCT CTGCACCTAG CATAAAGCCA 17580

GCCGCAGCAC CTTCACCATC CGCAATTCCT CCTGCAGCAA TAACAGGAAT AGATATAGCT 17640

GTGGCTACCT GTCGCACCAA GGTCATGGTT GTTAATTTAC CGATATGCCC CCCAGCTTCC 17700


ATTCCTTCTG CAATAACAGC GTCTGCACCG ATTTTTTCCA TGCGTTTAGC TAAAGCGACA 17760

CTAGGAACAA CAGGAATAAC GATTATCCCA GCTTCATGGA AACGTTCCAT ATACTTGCTT 17820

GGATTTCCTG CTCCTGTTGT GACAACTTTA ACACCTTCTT CAATAACGAG ATCCACGATG 17880

TCTTCCACAA AGGGAGATAA GAGCATGATG TTGACCCCAA AGGGTTTATC AGTCAATGAT 17940

TTGATTTTAT CAATATTGGC CTTGACAACT TCTTTCGGGG CATTTCCCCC ACCGATAATT 18000

CCTAATCCTC CAGCCTTGGA AACAGCCCCT GCCAAATCAC CATCAGCAAC CCAGGCCATC 18060

CCTCCTTGGA AAATAGGATA ATCAATCTTC AATAATTCTG TAATACGCGT TTTCATAGTG 18120

CCTCCAACCT TCCTTGCTTA CGTAATAGTT CGATTTCACC ATAATTTGAC AGTCAAACTA 18180

TTACCTAAAC AAGAGGGAGT GGGTTTCTCC CTACTCCTTC TACTAATATT CTGCTTATTT 18240

TGCTTGCTCT TCAACGTAAG CAACCAAGTC ACCAACTGTT TTCAAGTCAT TTTCTGCTTC 18300

GATTTGGATA TCAAAAGCAT CTTCGATTTC TGAGATTACT TGGAACAAGT CCAATGAATC 18360


TGCGTCCAAA TCATCAAAAG TTGATTCAAG TGTTACTTCT GATGCGTCTT TTCCAAGTTC 18420

TTCAACGATA ATTTCTTGTA CTTTTTCAAA TACTGCCATG ATAGGACTCC TTTAAAATAA 18480

ATAGTTTTTT TATAACAATG TGTTCACCAC ATGATTACCT AAATTGTAAG AATGAGCGTG 18540

CCCCAGGTCA AGCCTCCACC GAAGCCTGAT AGAAGAACAG TCTGGCTACC ATCTAAAGGG 18600

ATGAGACCTT GTTCTACACA CTCTGAAAGT AAAATCGGGA TACTGGCTGC ACTGGTATTG 18660

CCATATTCCA TCATATTGGC TGGAAGTTTG GCTCGGTCAA CACCAATTTT TCTAGCCATG 18720

TTATCCAAAA TACGGTCATT GGCTTGATGA AGTAGCAGAT AATCCAAGTC TGTCACCTCT 18780

ATAGGAGATT CATCAATAGT CTGCTTGATA GACTTGGCTA CATCTCGAAT GGCAAAATCA 18840

AAGACTGTGC GTCCATCCAT CTTCAAAAAC GAATCTGCAC TTTCTTGATC TGAAAATGGA 18900

GAATGTAAAC CTGAATGCCC ATAAGTTAAA CACTCGCTGC GACTTCCATC GCTATTGAGA 18960

CTCTCAGCTA AGAAATGCTC TTGCTCGCTA GCTTCTAACA AGACACCACC AGCACCATCT 19020

CCAAACAACA CAGCTGTTGA TCGATCCGAC CAATCGACTG CCTTAGAGAG GGTTTCACTA 19080

CCAATCACCA AGCCTTTTTG AAAGCGACCA GAAGCGATAA ACTTTTCAGC AGTTGAAAGA 19140

GCAAATACAA ATCCACTGCA AGCCGCGGTT AAGTCAAAAG CAAAGGCTTT ATTAGCACCA 19200

ATATTAGCTT GAACACGAGC AGCTGTAGAG GGCATCATCG AATCTGGAGT AATGGTAGCT 19260

AGGATGATAA AATCCAGTTC TTCTCCTGTT ATTCCAGCTT TTGCCATCAG TTTCTTAGCA 19320

ACCTCTGTAG CCAAATCACT GGTAGATTCT GTTCTTGAAA TATGCCTTTG TCGTATTCCC 19380

GTTCGACTTG AAATCCACTC ATCATTGGTA TCCATAATCT GAGCCAAGTC GTGATTTGTA 19440

ACCACTTGCT CTGGCACATA ATGAGCAACC TGACTTATTT TTGCAAAAGC CATTATTTCA 19500

AATCCTCCAA AAATTGGTAA AGATTAGTCA AACCTTTACC CATGACAGCA ATTTCTTCCT 19560

CGCTCATGCC ATCAATAATT TTTTCTACCA TGGCCTTGTG GAAGCGTTTA TGCAGTCTAT 19620

GAATCAAGCG ACCCTTCTTT GTCAAATGCA GATGCACCAC ACGACGATCC TGTTCTGACC 19680

GAACTCGCTC AATGTAGCCC GG 19702


(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6211 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GAAAATTTCC TCTCTTCTCT TGAAAAATTT TGAAAAAATG GTATGATAGT AACAAGTTAT 60 

TTTTAAGAGG AAAGAAAGGG GAATAATGGA GAAAATCAGT TTAGAATCTC CTAAGACGGG 120 

GTCGGACCTA GTTTTGGAAA CACTTCGTGA TTTAGGAGTT GATACCATCT TTGGTTATCC 180 

TGGTGGTGCG GTTTTGCCTT TTTATGATGC GATATATAAT TTTAAAGGCA TTCGCCACAT 240 

TCTAGGGCGC CATGAGCAAG GTTGTTTGCA TGAAGCTGAA GGTTATGCCA AATCAACTGG 300 

AAAGTTGGGT GTTGCCGTCG TCACTAGTGG ACCAGGAGCA ACAAATGCCA TTACAGGGAT 360 

TGCGGATGCC ATGAGCGATA GCGTTCCCCT TTTGGTCTTT ACAGGTCAGG TGGCGCGAGC 420 

AGGGATTGGG AAGGATGCCT TTCAGGAGGC AGACATCGTG GGAATTACCA TGCCAATCAC 480 

TAAGTACAAT TACCAAGTTC GTGAGACAGC TGATATTCCG CGTATCATTA CGGAAGCTGT 540 

CCATATCGCA ACTACAGGCC GTCCAGGGCC AGTTGTAATT GACCTACCAA AAGACATATC 600 

TGCTTTAGAA ACAGACTTCA TTTATTCACC AGAAGTGAAT TTACCAAGTT ATCAGCCGAC 660 

TCTTGAGCCG AATGATATGC AAATCAAGAA AATCTTGAAG CAATTGTCCA AGGCTAAAAA 720 

GCCAGTCTTG TTAGCTGGTG GTGGAATTAG TTATGCTGAG GCTGCTACGG AACTAAATGA 780 

ATTTGCAGAA CGCTATCAAA TTCCAGTGGT AACCAGTCTT TTGGGACAAG GAACGATTGC 840

AACGAGTCAC CCACTCTTTC TTGGAATGGG AGGCATGCAC GGGTCATTCG CAGCAAATAT 900 

TGCTATGACG GAAGCGGACT TTATGATTAG TATTGGTTCT CGTTTCGATG ACCGTTTGAC 960 

GGGGAATCCT AAGACTTTCG CTAAGAATGC TAAGGTTGCC CACATTGATA TTGACCCAGC 1020 

TGAGATTGGC AAGATTATCA GTGCAGACAT TCCTGTAGTT GGAGATGCTA AGAAGGCCTT 1080 

GCAAATGTTG CTAGCAGAAC CAACAGTTCA CAACAACACT GAAAAGTGGA TTGAGAAAGT 1140 

CACTAAAGAC AAGAATCGTG TTCGTTCTTA TGATAAGAAA GAGCGTGTGG TTCAACCGCA 1200 

AGCAGTTATT GAACGAATTG GTGAATTGAC GAATGGAGAT GCCATTGTGG TAACAGACG'T 1260 

TGGTCAACAC CAAATGTGGA CAGCTCAGTA TTATCCCTAC CAAAATGAAC GTCAGTTAGT 1320 

GACTTCAGGT GGTTTGGGAA CAATGGGCTT TGGAATTCCA GCAGCAATCG GTGCTAAAAT 1380 

TGCTAACCCA GATAAGGAAG TAGTCTTGTT TGTTGGGGAT GGTGGTTTCC AAATGACCAA 1440 

CCAGGAGTTG GCTATTTTGA ATATTTACAA GGTGCCAATC AAGGTGGTTA TGCTGAACAA 1500 

TCATTCACTT GGAATGGTTC GCCAGTGGCA GGAATCCTTC TATGAAGGCA GAACATCAGA 1560 

GTCGGTCTTT GATACCCTTC CTGATTTCCA ATTGATGGCG CAGGCTTATG GTATTAAAAA 1620 

CTATAAGTTT GACAATCCTG AGACCTTGGC TCAAGACCTT GAAGTCATCA CTGAGGATGT 1680 
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TCCTATGCTA ATTGAGGTAG ATATTTCTCG TAAGGAACAG GTGTTACCAA TGGTACCGGC 1740 

TGGTAAGAGT AATCATGAGA TGTTGGGGGT GCAGTTCCAT GCGTAGAATG TTAACAGCAA 1800 

AACTACAAAA TCGTTCAGGA GTCCTCAATC GCTTTACAGG TGTCCTATCT CGTCGTCAGG 1860 

TTAATATTGA AAGCATCTCT GTTGGAGCAA CAGAAGATCC GAATGTATCG CGTATCACTA 1920 

TTATTATTGA TGTTGCTTCT CATGATGAAG TGGAGCAAAT CATCAAACAG CTCAATCGTC 1980 

AGATTGATGT GATTCGCATT CGAGATATTA CAGACAAGCG TCATTTGGAG CGCGAGGTGA 2040 

TTTTGGTTAA GATGTCAGCG CCAGCTGAGA AGAGAGCTGA GATTTTAGCG ATTATTCAAC 2100 

CTTTCCGTGC AACAGTAGTA GACGTAGCGC CAAGCTCGAT TACCATTCAG ATGACGGGAA 2160 

ATGCAGAAAA GAGCGAAGCC CTATTGCGAG TCATTCGCCC ATACGGTATT CGCAATATTG 2220 

CTCGAACGGG TGCAACTGGA TTTACCCGCG ATTAAAAATC CAACTTAAAT TTATTAAACC 2280 

AGCCTAAAAG GCAATAAATA ATAGAAAAGA GAGAAAAGCT ATGACAGTTC AAATGGAATA 2340 

TGAAAAAGAT GTTAAAGTAG CAGCACTTGA CGGTAAAAAA ATCGCCGTTA TCGGTTATGG 2400 

TTCACAAGGG CATGCGCATG CTCAAAACTT GCGTGATTCA GGTCGTGACG TTATTATCGG 2460 

TGTACGTCCA GGTAAATCTT TTGATAAAGC AAAAGAAGAT GGATTTGATA CTTACACAGT 2520 

AGCAGAAGCT ACTAAGTTGG CTGATGTTAT CATGATCTTG GCGCCAGACG AAATTCAACA 2580 

AGAATTGTAC GAAGCAGAAA TCGCTCCAAA CTTGGAAGCT GGAAACGCAG TTGGATTTGC 2640 

CCATGGTTTC AAGATCCACT TTGAATTTAT CAAAGTTCCT GCGGATGTAG ATGTCTTCAT 2700 

GTGTGCTCCT AAAGGACCAG GACACTTGGT ACGTCGTACT TACGAAGAAG GATTTGGTGT 2760 

TCCAGCTCTT TATGCAGTAT ACCAAGATGC AACAGGAAAT GCTAAAAACA TTGCTATGGA 2820 

CTGGTGTAAA GGTGTTGGAG CGGCTCGTGT AGGTCTTCTT GAAACAACTT ACAAAGAAGA 2880 

AACTGAAGAA GATTTGTTTG GTGAACAAGC TGTACTTTGT GGTGGTTTGA CTGCCCTTAT 2940 

CGAAGCAGGT TTCGAAGTCT TGACAGAAGC AGGTTACGCT CCAGAATTGG CTTACTTTGA 3000 

AGTTCTTCAC GAAATGAAAT TGATCGTTGA CTTGATCTAC GAAGGTGGAT TCAAGAAAAT 3060 

GCGTCAATCT ATTTCAAACA CTGCTGAATA CGGTGACTAT GTATCAGGTC CACGTGTAAT 3120 

CACTGAACAA GTTAAAGAAA ATATGAAGGC TGTCTTGGCA GACATCCAAA ATGGTAAATT 3180 

TGCAAATGAC TTTGTAAATG ACTATAAAGC TGGACGTCCA AAATTGACTG CTTACCGTGA 3240 

ACAAGCAGCT AACCTTGAAA TTGAAAAAGT TGGTGCAGAA TTGCGTAAAG CAATGCCATT 3300 

CGTTGGTAAA AACGACGATG ATGCATTCAA AATCTATAAC TAATTAGAAA TATATAGCGC 3360 

TGGAGATGAT TTTATGAAAA AGATTATGAG AAAAATTGCA TCGTTATTAT TGGTTCTAGT 3420

TGTATAATGT AATTACACCG TCGGTAATAG TGCTAGCAGA CCAAAATAAA GCAGATTGGT 3480

CGTATGATGA AAATGCTGTA ATTAACATTT ATGATGATGC TAATTTTGAA GATGGTAGGT 3540 

TGCATATGAA CTTTGAACAA TTCTTCAAAT TGGCACAAAT AGCTAGAGAA GAAGGTCTTG 3600 

AAATTCATTC TCCGTTTGAG AGAGCTGGTG CGACTAAATC TGCTCGTTAT ATAGCGAAAT 3660 

GGATTTTGAG AAATAAAAAA CATTAACAAA TATAGTTGGT AAATCATTAG GACCTAAATC 3720 

AGCTGTTAGA TTCGGAGAAG CTTTATCCTA TATTGAAGGT CCTCTTCGCA GAATAAATGA 3780 

GACGATAGAT GGCGGTTTAT ATCAAATAGA GCAAATTATT GCATCTGGAT TGAAAGAATC 3840 

GGGTTTAAAT GACTGGACTG CGAAAACTTT AGCTTCAGCT ATTCGTGGGA TATTAGATGT 3900 

ACTTATTTAG GGGTTGAAAT CATATGAATA TTACCAATTT GTTTTCTATC AAGACAGGAT 3960 

GTGATGAAAC TGATAGGCAA CTGCAAAAAC TATTTTTTCA GTTGGATTTA CAATTGGGAG 4020 

AATTGACAGA TCAACTAAGA AAATTAGATT CTAATTTTGT TCCTCGTAGT CAATTTGTAG 4080 

ACACGTTGGA TTTGAATGAT GTAGAATATA AAGAAATTTT AAACTATTTT ATCTTCCATC 4140 

GTAATGATAG TGAAGAAAGT TTGGTAGAAT GGTTATATGA TTGGATTTCC ACAAATCGTT 4200 

ATGAACTTCC TAAAGAGTTT TCGATTCGTA TGGCTCATAA ATACCATGAA AGTGTTACTG 4260 

AAGTTTTCGG AGATGAATAA CTAAAAAACA GTCATTAGTG ACTGTTTTTT ATAGAAAAAG 4320 

AGGTTTTATA TGTTAAGTTC AAAAGATATA ATCAAGGCTC ACAAGGTCTT GAACGGTGTG 4380 

GTTGTGAATA CTCCACTGGA TTACGATCAT TATTTATCGG AGAAGTATGG TGCTAAGATT 4440 

TATTTGAAAA AAGAAAATGC CCAGCGTGTT CGCTCCTTTA AAATTCGTGG TGCCTATTAT 4500 

GCCATTTCCC AGCTCAGCAA GGAAGAACGT GAACGTGGGG TAGTCTGCGC TTCTGCGGGA 4560 

AATCATGCGC AGGGAGTAGC CTATACTTGT AATGAAATGA AAATTCCTGC TACTATCTTT 4620 

ATGCCCATTA CTACGCCACA ACAAAAGATT GGTCAGGTTC GCTTTTTTGG TGGGGATTTT 4680 

GTAACTATTA AACTAGTTGG AGATACCTTT GATGCCTCAG CCAAAGCAGC TCAAGAATTT 4740 

ACAGTCTCTG AAAATCGTAC CTTTATTGAT CCTTTTGATG ATGCTCATGT TCAAGCAGGT 4800 

CAAGGAACAG TTGCTTATGA GATTTTAGAA GAAGCTCGAA AAGAATCGAT TGATTTTGAT 4860 

GCTGTCTTGG TTCCTGTTGG TGGTGGCGGT CTCATTGCCG GGGTTTCTAC CTATATCAAG 4920 

GAAACAAGTC CAGAGATTGA GGTTATCGGA GTAGAGGCGA ATGGAGCGCG TTCCATGAAA 4980 

GCTGCCTTTG AGGCTGGAGG TCCAGTAAAA CTCAAGGAAA TTGATAAATT TGCTGATGGG 5040 

ATTGCTGTGC AAAAGGTAGG TCAGTTGACC TATGAAGCAA CTCGTCAACA TATTAAAACT 5100 

TTGGTAGGTG TCGATGAGGG ATTGATTTCT GAAACCTTGA TTGACCTTTA CTCTAAGCAA 5160 

GGGATAGTCG CAGAACCTGC TGGAGCGGCT AGTATCGCCT CTTTAGAGGT TTTAGCTGAA 5220 






TATATTAAGG GGAAAACCAT TTGTTGTATC ATTTCTGGAG GAAATAATGA TATCAACCGT 5280

ATGCCAGAAA TGGAAGAGCG TGCCTTGATT TATGATGGTA TCAAACATTA CTTTGTGGTC 5340

AATTTCCCAC AACGTCCAGG AGCTTTGCGT GAGTTTGTAA ATGATATCCT GGGGCCAAAT 5400

GATGATATCA CACGTTTTGA GTATATCAAA CGAGCTAGCA AGGGAACAGG CCCAGTATTA 5460

ATTGGGATCG CTTTAGCAGA TAAGCATGAT TATGCAGGTT TGATTCGTAG AATGGAAGGT 5520

TTTGATCCAG CTTATATTAA CTTAAATGGT AATGAAACGC TTTATAATAT GCTTGTCTGA 5580

GGACTAATAA AAAAATATCA TACCTTCATT TTGATTTCCT ATCTATTGAG AAGCATAGTG 5640

ACACTGTCTT TAATACTCTT CGAAAATCTC TTCAAACCAC GTTAGCTCTA TCTGCAACCT 5700

CAAAACAGTG TTTTGAGCAA CTTGCGGCTA GCTTCCTAGT TTGCTCTTTG ATTTTCATTG 5760

AGTATAAGGT ATGATTTGAT TTCTTTTTGT TGACAAATAT ACTATATTAA AAAGATATAT 5820

AAGTAATTAA CTGAGCTTAT CTGTCTTGTC ATCTCTATTA AGGATGGTTT AGATAATCGG 5880

GTGTCTGCTT CTAGGCTAGC ACCTCAATAT CCAAAGGAGT GATGAATTTG AAGGACATAA 5940

GGAATACCTA TCTCTCAGAT GATTTATTGA GGAAGAAAGA TAGGAGTTTT TGAGCTAGTG 6000

AAGGCTTGGA TTTCTAAAGG TTAGAACTAT CATCTTCAGT TCTTAAATCG AAGAAATAAG 6060

CTATCTTACG GAAATAGAGA AGCATTTTTT AAGAACTTGA ATAATTTCGC ACCTTAAGAG 6120

GGTAATAATA CAGTATTTTT ATTAGCAAAT ATTTATGGTG TAGAGGCTAG CAAAACCTAT 6180

ATATTATCGG ATTTAAAAAG GAAGTAAGAA A 6211



(2) INFORMATION FOR SEQ ID NO: 9: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7939 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

CCGGACTCCC CACGATTCTT CAAAATAACT GAGTATATTT CTATCTTGAT TTTCAGATAT 60 

AAATTCTTCC TTCTGTGGCC TCTTCTTACG CTTGAGAAGA GCTTCTCCGA CATGGCTTCT 120 

TCCTTACTGA GCAAAACCTT GAGCATAGAT AAGTTTGACT GGCAAGCGTG CTCTTGTATA 180 

TTTGGCTCCC TTCCCACTAT TGTGGATAGC GAGGCGTCTT CTCATATCAG TCGTATAGCC 240 

TATATAGTAG GATCCATCAC GACACTCCAG AACGTACATA TAAGCCTTAT GATCCATAAT 300 

AAATCTCTTC GATTTCGGGC GTATAAGAGC CATCATCATT GTGGACAATC AAAGGAGGTA 360

AGACCTTAAA GCCACTTGTT GAGCCATCCT TGATCGCCTC AATCAAAAGC ATATTGGCTT 420

CCTTTTCTCT TTTTGGATAA ACAAACTGCA GGCGCTTAGG GGCTAGATTA TGTCGTTTTA 480 

ACGTATCCAA AATATCCAGA AGTCGATCAG GACGATGAAC CATGGCCAAA CGCCCATTAG 540 

ACTTGAGAAT ACTCTGGGCA CTACGACAGA TTTCTTCCAA ATTAGTCGTG ATTTCGTGTC 600 

GAGCCAAGAG ATAATGTTCA CTCTCGTTCA GATTAGAATA AGGATTCACC TTGAAATAGG 660

GTGGATTACA CAAAATCATA TCCACCTTAC TCCCCTGAAT GTGAGCAGGC ATATTTTTCA 720 

AATCATCGCA GATGACCTGC ATTTGCTCCT CTAATCCATT CAAACGGACA GAGCGTTCAG 780 

CCATATCCGC CAAACGCTCC TGAATCTCAA CAGACAATAT CTGTGCTTGA GTACGAGTGC 840 

TAGCAAAAAG CCCCACTGCT CCATTCCCAG CACAGAAATC CACAATCAAC CCCTTCTTAG 900 

GAAAACGTGG AAATCGTGAT AAGAGAACAC TATCCACCGA ATAGCTAAAA ACCTCTCTAT 960 

TTTGAATGAT TTTGATATCT GTCGAAAAGA GCTGGTTAAT GCGCTCTCCT GATTTTAATA 1020 

ATTGTTCTTC TTCCATGGTC CTATTATAGC AAATTCATAT TAACATTACA AAAAATATAA 1080 

AACTCTAAAC TACTTCTTCT TTTTTAAATG GTGCAGGGCT TCTCCAGTCC AGATTGGTAG 1140 

CATTCGTCGA AAGGGAGCAA AGCCGTAGTT AAAGCGGTCG CTTGAAAAGC GTCTCCGTCT 1200 

AGGAAACTGG TACTTTTCTT CCTCCAAAGT GCGGATAGAA AGACTGGCTT TCCCTGTAAA 1260 

TTCATCTAAA TCCACTACCT GAACTTGAAC CTCTTCATCG ACTTTCAAGG TTTCATGAAT 1320 

ATTTTCAATA AATCCTGTCC GAATCTCTGA AATGTGAATC AGCCCCGTAT GACCCGTCTC 1380 

TAACTCAACA AAGGCACCGT AGGGCTGAAT CCCTGTAATA CGCCCCTTTA GCTTATCACC 1440 

GATTTTCATC TTAGTCCTCG ATTTCAATAG TTTCAATTAC AACATCTTCA ACTGGCTTGT 1500 

CCATAGCTCC TGTCTCAACA GCAGCAATGG CATCCAAGAC AGCGTAAGAT GCTTCATCAG 1560 

CTAACTGACC AAAAACCGTG TGACGGCGGT CTAGGTGAGG TGTCCCACCT TGATTGGCAT 1620 

AGATTTCTGC AATCGGTTCT GGCCAACCAC CACGAGTAAT TTCTTTCTTA GAATAAGGTA 1680 

GGTGTTGGTT TTGCACGATA AAGAACTGGC TGCCGTTGGT ATTTGGACCA GCATTTGCCA 1740 

TGGAAAGAGC ACCACGGATA TTGTAAAGCT CTTCTGAGAA TTCATCCTCA AAAGATTCGC 1800 

CGTAGATTGA CTCGCCACCC ATACCAGTTC CAGTTGGGTC TCCACCTTGG ATCATAAAGT 1860

CCTTGATAAT ACGGTGGAAA ATGACACCAT CATAGTAGCC ATCTTTTGAA AGAGATACAA 1920 

AGTTAGCCAC TGTTTTAGGA GCATGTTCAG GGAAAAGCTT GATACGTAAG TCTCCGTGAT 1980 

TGGTCTTAAT AGTCGCAAGA GGACCTTCTA CTGTTTCAAT GTCTACTTGT GGAAAATGCA 2040 

ATTCTTTTTC TACCATACCA AATACTTCTA AGGCAGCAAA AATGCCATCT TCTTCTAATG 2100 

TTTTTGTAAT ATAATCTGCT TTTTCTTTGA TTTTATCATG AGAAATTCCC ATGGCAACGC 2160 






TGATTCCAGC ATAATCAAAG AGTTCCAAGT CGTTGAGACC ATCTCCAAAA ACCATGACCT 2220

TCTCTGGTTT CAAGCCAAGG TGTTCCACAA CCTTTTCCAC CCCCGTCGCT TTGGAGCCTG 2280

AAATCGGCAC AATATCAGAC GAATGTTGAT GCCAACGAAC CATGCGAAGT TTGTCTGAGA 2340

GACTGTCAGG CAAGTGCAAG TCATCTCCCT TATCTTCAAA AGTCCACATC TGATAGATAT 2400

CTTCTTTTTC ATGGAAATCG GGATCTACAT CTAAGTCGGG ATAAATTGGA TTGATAGCTT 2460

CACTCATCAT ATCGGTGCGA GTCGACAACT TGGCATCATG ACTCCCAACC AAGCCATACT 2520

CAATTCCTTC TTGCTTAGCC CAAGAGATAT ACTCCTCAAC ATCTGACTTT TCAATCTGAT 2580

GCTGATAAAT GACCTGACCT TTTTTATCTT CGATATAAGC CCCATTCAAA GTTACAAAAA 2640

AGTCAGGCTT GAGATCACGA ATCTCTGGAA CAACACCAAA AATGCCACGT CCAGAGGCGA 2700

TTCCTGTTAA AATTCCTTTT TCACGCAACT GTTTAAAAAC AGTGGGAATT GTAGTTGGAA 2760

TAAACCCTGT CTTTGAATTC CGCAATGTAT CATCAATATC AAAAAAGACA ATCTTGATCT 2820

TCTTTGCCTT GTATCTTAAT TTCGCGTCCA TCTCACTACC TCTTTGAATC TAACTCTTTC 2880

CATTATATCA TAAAGTAGGC AAATCCCCTA TTTTCAAAAA GTTTATCATT TTTATTTTAA 2940

TTTCTTGGAT GAGAAAAGAG ACATATTTAT GAAAAAGCTC CATCGTGCTT TTAATGTGTT 3000

CTCTTGTTTT CAAACTCGTA AAAAGGGAGC CACTGATCCT AACTCGCTCT CTCATTTCAA 3060


AGCTTGTGAA AAAAGACCCG TTGGGGTCTT AATTCGCTTT CTTGTTTTCA AGCTCATGAA 3120

AAAGAGACCC AACTGGGTCT TTTCTTTAAT CTTCGTTTAC GAAAGGCATC AAAGCCATTA 3180

CGCGAGCGCG TTTGATAGCT GTTGTTACTT TACGTTGGTT TTTAGCTGAA GTTCCTGTTA 3240

CACGACGAGG AAGGATTTTC CCACGTTCTG AAACGAAACG GCTAAGAAGC TCAGTATCTT 3300

TGTAATCAAC ATATTGAATT TTGTTTGCTG CGATGTAATC AACTTTTTTA CGGCGTTTGA 3360

ATCCGCCACG ACGTTGTTGA GCCATGTTTT TTCTCCTTTA TAAGTTTAGT TGTCCATTAG 3420

AATGGTAAAT CATCATCTGA AATATCCAAT GGGTTTGTTG CTCCAAATGG ATTTTCATTA 3480

CGTGAAAAGT CTGGTACTGA ATTTGTAGGT GCTGAATAGT TTGGAGTTGG TGCAGAGTAA 3540

GCTCCACCTG TGTGACCCTC ACGCACACTA CGGCTTTCCA ACATTTGGAA ATTCTCAGCC 3600

ACGACCTCTG TCACGTAGAC ACGTTGTCCT TGCTGGTTAT CGTAACTACG AGTCTGGATA 3660

CGACCTGTCA CCCCGATAAG TGAGCCTTTT TTAGCCCAGT TAGCAAGATT TTCAGCCTGT 3720

TGGCGCCACA TAACGACATT GATAAAATCA GCCTCACGTT CACCATTTTG ACTCTTAAAT 3780

GTAOGGTTTA CTGCAAGAGT AAAAGTCGCA ACTGCTAGAT TTGATGGGGT ATAAGGCAAC 3840

TCAGCGTCAC GTGTCATACG CCCTACAAGT ACAACATTGT TAATCATAGT TTACCTTCTT 3900

ACGCGTCAAT TTTGACGATC ATGTGACGAA GAATGTCAGC GTTGATTTTT GAAAGACGGT 3960

CAAACTCTTT AAGAGCTGCA TCGTCATTTG CTTCAACGTT AACGATGTGG TAAAGTCCTT 4020 

CACGGAAATC TTGGATTTCG TATGCAAGAC GACGTTTTTC CCAAGTTTTT GATTCAACAA 4080 

CAGTTGCACC GTTGTCAGTC AAAATAGAGT CAAAACGTGC TACCAAAGCG TTTTTAGCTT 4140 

CTTCTTCAAT GTTTGGACGA ATGATATAAA GAATTTCGTA TTTAGCCATT GATATGTTCC 4200 

TCCTTTTGGT CTAATGACCC CAAGACTTTG CAAGGGGTAA GTGAGGTTCG CTCACAATAA 4260 

ACTATTATAC TAGAAAAAAT TTTTTTACGC AAGTAAAAAC ACTAGAATTC GAAAAAACGC 4320 

CACATGGGCG TTTTCCTGTT CTTATGGTTT GATACGGTGC AACATACGTG GGAATGGAAT 4380 

AGCTTCACGG ATATGTTTTG TTCCTGCTGC GAAGGTTACC ATACGTTCGA TACCGATACC 4440 

AAATCCTCCG TGTGGAACTG TACCGTATTT ACGAAGGTCA AGGTAGAATT CATATTCTGT 4500 

ACGATCCATG CCAAGTTCAT CCATCTTAGC GACAAGGGCA TCGTAATCTT CCTCACGCAT 4560 

AGACCCACCG ATAATTTCTC CATAGCCTTC TGGAGCAAGC AAGTCTGCAC AAAGCACGCG 4620

CTCTGGATTT CCAGGAACTG GTTTCATGTA GAAGGCCTTG ATGGCTGCTG GATAGTTCAT 4680 

GACAAATGTT GGCACACCAA AGTGGTTTGA AATCCAAGTT TCGTGTGGTG ACCCAAAGTC 4740 

ATCACCATGC TCAAGATGCT CGTAGTCAGC ATCTTCATCA TTTTCATGCT CTTGCAAGAG 4800 

GTCAATGGCT TGATCGTAAG TGATACGTTT GAATGGCTCT GCAATGTAGC GTTTCAAGAG 4860 

TTCTGTATCA CGTTCCAAGG TTTCCAAGGC TTGAGGCGCG CGGTCAAGAA CACCTTGTAG 4920 

AAGAGCTTTC ACATAAGCTT CTTGCAAGTC AAGCGACTCA TCATGTGTCA AGTATGAGTA 4980 

CTCAGCATCC ATCATCCAGA ACTCAGTCAA GTGACGGCGT GTTTTTGATT TTTCAGCACG 5040 

GAAAACTGGA CCAAAGTCAA AGACACGACC AAGAGCCATA GCCCCTGCTT CTAGGTAAAG 5100 

CTGACCTGAT TGGCTCAAGT AGGCTGGCGT TCCGAAGTAG TCAGTTTCAA AGAGTTCTGT 5160 

AGAATCTTCT GCCGCATTTC CTGAAAGAAT TGGGCTGTCA AACTTCATAA AACCGTTCTT 5220 

GTCAAAGAAC TCATAAGTTG CATAGATAAT AGCGTTACGG ATTTGCAACA CAGCTACTTG 5280 

CTTACGAGAG CGTAGCCACA AGTGACGGTT ATCCATCAAA AAGTCTGTTC CGTGTTCTTT 5340

TGGTGTGATT GGGTAGTCTT GAGATTCACC GATCACTTCG ATGTCTGTGA TGTCCAACTC 5400 

ATAGCCAAAT TTAGAACGTT CGTCCTCTTT GACAATACCT GTCACATAAA CAGACGTTTC 5460 

TTGGCTCAAG CGTTTGATAA CATCAAACTT CTCAAGTCCC ACTTCTTCAC CAAATTTTTC 5520 

GACAAAGTTT GGTTTAAAAG CCACACCTTG AAAGAAGGCT GTTCCATCAC GCAATTGTAA 5580 

GAAAGCGATT TTTCCTTTTC CTGATTTGTT GGCAACCCAA GCGCCAATCG TCACTTCCTG 5640 

ACCAACATAG TCTTTTACGT CAATAATCGT TACACGTTTT GTCATTATTT TTCCTTTTCT 5700 






TTTTTATTCT TTATGGCAAA CCACCTCTAT ATTGTTCCCA TCCAGGTCAA TCATAAAAGC 5760

AGCATAGTAA ATCGGATGCT CACTTCGATA ACCAGGAGCC CCATTGTCTC GCCCACCTGC 5820

CTCTAAGCCA GCCTCATAAC AAGCCTGAAC TTCTTCCTTA TTTTCTGCTA AAAAAGCAAA 5880

ATGAACAGGA TCTTGTGTTC CCTGAGTCAG CCAAAAATCA CCACCAGGAT GAGGGCTGTT 5940

CGGGGATAGA AAACTAATTA GAGAACTAGT CTTAAAAGCC AATTTATAGT CCAAAGGAGC 6000

GAGAAAACTC CTATAAAATC CTTATGAAAT TTGTAAATCC TTTACGTTAA TCTCAAAATG 6060

ATCAATCATT CTCACTACCC ATAAATGCTT TCAAGCGTTC GACTGCTTCT TTAAGCGTGT 6120

CTAGGTCTGT CGCATAGCTG AGGCGGACAT TTTCTGGTGC TCCAAATCCA GCTCCTGTTA 6180

CCAAGGCCAC TTCGGCTTCT TCTAAGATAA CAGTTGTAAA GTCTGTCACA TCCGTGTAGC 6240

CTTTCATCTC CATGGCCTTT TTGACATTTG GGAAGAGATA GAAGGCCCCT TGCGGTTTGA 6300

CCACTTCAAA TCCTGGTACC TCTGCAAGGA GGGGATAGAT GGTATTAAGA CGTTGCTCAA 6360

AGGCCTGACG CATGCTTTCT ACAGTATCTT GCTGACGTGA TAGAGCCTCA ACTGCTGCAT 6420

ATTGGGCTAC TGCTGACGGA TTCGAAGTTG TTTGACCTGC AATCTTGGAC ATGGCAGGGA 6480

TAATGTCTGC TTCTCCAACG GCATAACCAA TCCGCCAACC AGTCATGGCA TAAGTTTTAG 6540

ACACACCATT GATGACCACT GTTTGCTTGC GAATCGCTTC CGATAGGGTA GAAATGGGTG 6600


TGAACTCATG ACCATTATAA ACCAAGCGGC CATAGATATC GTCTGGTAGG ATGAGAATAT 6660

CATTTTCTAC AGCCCAGTTT CCAATTGCCA AGAGTTCCTC ACGGGTGTAA ATCATACCTG 6720

TGGGATTAGA TGGCGAATTC AGGACCAAAA CCTTGGTCTT GTCAGTGCGA GCTGCTTCTA 6780

ACTGCTCTAC GGTCACCTTA AAGTGATTGT CTTCGTTAGC AGAAAGAAAG ACGGGAACGC 6840

CTTCTGCCAT CTTGACCTGA TCTGCATAGC TAAGCCAGTA TGGGGTTGGG ATGATGACTT 6900

CATCACCTGG ATTGACCACA GCCATAAAGA AGGTATAGAG AGAATATTTG GCTCCCGCAG 6960

CGACTGTGAC TTGATTTGAC GCTACAGAAT AGCCGTAAAA GCGCTCAAAG TAGCTATTGA 7020

CCGCCGCCTT AAGCTCTGGC AGACCTGAGG TTACTGTATA AAAAGAAGGA CGCCCATGTC 7080

GAATCGATGC AATGGCGGCA TCTTGGATAT TTTTGGGAGT AGTGAAATCT GGCTCACCCA 7140

AGGTTAGAGA CAAAATATCT CTACCCTCAG CCTTCAGTGC TTTGGGACGG GCTCCAGCAG 7200

CCAAAGTCAC ACTTTCTTCC ATTTCTAAAA CACGGTTGGA TAGTTTCATA GGCCCTCCTT 7260

GTTGACCAAT GCTCCTGTTT CAAAATCTAC TAGATAAAAA TCAGATCCTG ACTTAACTTC 7320

CCAGATTGGC TTATCTTGAT AACGGCCAAA GGTTATGTTG TGAATGTGGC GAGGTGGCTT 7380

TTCCTTAGAA ACCGTTTCTG CTTTTTCTTG TGAAACACCC TGATTTAGCT GATAAAGGTA 7440

AATCTTATGG TCATCTTTAC CAATCAGGAC AGCAAGCGCT TCTTGCTGTT TGTTACGACC 7500

AAGAACGCTG TAATAAGATT CCAAGCCATT GTATAAATCA ACCTGATCAG CCTGCTCTAA 7560 

TCCTGCATAC TGCTGAGCTA ATTTTTCTCC TTCACTTTTA GCTGTTTGAT AGGGTTTCAT 7620 

GCTAAGAGAA ACCATATACA GAAAGGAACC ACTGATAACC ACAAACAAAA TCGTCATCCC 7680 

TAGACCATAC TGCCACAGTA GATTATTTTT TGCTTTGTTT TGTCTTTTTT TCACTCGTCT 7740 

ATTTTACCAT CTATTAAGCT TTATTACAAG TGAATATAAG AATACTCTTC GAAAATCTCT 7800 

TCAAACCACG TCAGCTTTAT CTGCAGACCT CAAAGCTGTG CTTTGAGCAA CCAATTCTAT 7860 

TTCTCCCTTC AAACAAAACC GATTTTGAAA GTGAAACAGT TCTTACTTTT TCAGTCACAA 7920 

ATGATTAGAG TTTGCCGGG 7939 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9897 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



CCGCTCTACC GTCAAATAAT TACCATTTTG TTTAATACCG AAATTTTTAT CTACTGAAAA 60

TTCAGTTGGT CTGTTGGTAC GATCGTCGTA TACAGTACCA TTCTCACGAA TAGTATAATT 120

GTAATCAGTA TCACCTTGTT TCCTTAATTT AAGGTAATAA TTACCATCAA TTTGTTTATA 180

ACCTGAATCT TTTCTAGTTG CTTCTCTAAA ACTTACTCCA GCAGGCATCA CATCAGCAAA 240

CATGAGTACT TGTTTGTTCT TTTTTTCAAC AATAACAGAG TCAATATAGG TTGCACCACC 300

GCTGATTTGT AAGTCACGTC CACCAACTTC ACGAGGCCAT TCTAATGGTA CTGGCGCAAA 360

ATCATCGAAT GCCAATGTTA ATTTTGGTTT AGTCCATGTC TTACCATTAT CATCACTATA 420

ACTTGTAGCA ATATTAATTT TATTCAAGAA ATCATGAGTT CCACCGTAAC GAGCGTCAAT 480

GCTTGAAAAT ACCCGACCAT TGCTAAAAGT ATACAGAACT GGAATACGGA AATAGTTAGA 540

ACCTGTTGTA TCATTAGCCG TATAAATTAA ATGTCCAGTA ACAGCGTTTG TTGTCATCTT 600


TTTAACAGTT TCTTCATCCA ATGCACTATT AAAGAATTTG ATATTTTCTA GTGTTCCGTT 660

AAAACCAAAC GCCGTTTTTC CTGCACGTTT CACTCCCCCA AGCATATAGT AATCAATACC 720

TTTAATATCC TTGATGTTTA GGAAATTATC CACTTTCTTT TCTACTACTT TTGTACCATT 780

TGCGTATAAA GAATATGTTT TTTTGACTGA ATCTGCTACT ACTGCAACAG TGTTAGTCAC 840

AGCCTCTTGT TTGTACTTAC CCCAAACTGA AGCAGGTCTG GATACTAGGT TATTTTTATT 900

GGAAGAAGTA TCACGCGCTT CCATCCCCAA CTCACCATTG TCTCTAAGGA ACACATCTAC 960

ATAACTATTT TGTTGACCGG GTTTGGAATT AGATATTCCA AACAGAGCTT GTAAGCGTTT 1020

CTCACTTGAC TGATTGTACT TAATCACTAC AGTAAAGTCA CCGCTAGTAA ATTTATCCTT 1080

TAACTCTTTA GTAACATTTT CTCCGCCCCC TGTTAAAGTA ACATTATTTT TTTCTAAGAC 1140

AGGAGTTTCT TCCGCTGTAG AAGATGGATC CTTAACAGTA GTTTCAACTG TTCGAGGTTG 1200

TACAGTAACT TCCGAAGAGT TATCCGATGT AGGTTGTACT TCCGAAATCG GAGTCGTTGG 1260

TGCAACAGGT TGCACCAACT TTGGTGTTGA TACTTCAGAA GTTTCAGTCT CCTGAGCTGC 1320

AACTGAGTTA GCAACAAATG CTGATAATAC CACTACAGTA CCTAAGGTTA CATATTGTTT 1380

AATATTTTTT TTCATTTTAT TTTTCCTCGT TTAAAACTTT GATAACAAGT TTTTTAACAG 1440

TTTCATCATT GCAATGAATC TTTGGTTGGT GAAGATCTTC TTCAAAAGTC ACCAACATAT 1500

TCCCTGGAAG CAATTCAACA ATTTGATAGT CTTTGCTATC GTAAAAAGCA ATATCCTTCT 1560

CTTCGCTAAA AGGTACACGT GACTGGGCAC GAACTGGGGA AGTTACTGCC ATTTTTTCAG 1620

TATTTTCAAC AACAATATGA ATATCTAAAT ATTTCTTATG AGTTTCAAAA ATATCTCCTG 1680


GAACTCCATC AGCTAGATAA GTCATACAAT TTGCAAAAAC ATTTTCCCCG TCAATATCAA 1740

TTTTTCCATC AACTAAATCT GTCAAATTTG TATTTTCTAA AAAATCACAG ACTTTTGAAA 1800

AATATTTATT GACAGAAGCA TATCGTTTAA AATCAGATTG TTCAGAAATA ATCATATTAT 1860

TTTCTCTTTT CTATTAGTGA CGAACTTCCC AACTTGAATC CGCTTTAATT TCTGTAATAT 1920

CATGAATCGT TGTATATTTA GGTGCAGATA CTTTATTTCC AGTAAGAACA GATACAATAT 1980

AACCTGAAAC TACTGATACA GAGATTGAAA TCAATGAATA TGCCCAGTAG CTAACAGCTG 2040

TTGGAGGAAG GAAGTATTTA ATAAATACCA TGACGATGGT TGATACAATC AGGGCTGCAT 2100

AAGCACCTTG TTTATTTGCT TTTTTAGAAA CAAATCCAAG AATAAATACA CCACCAAGTA 2160

GACCAAGTAC AAGTCCCATG AAACTATTGA ACCATTCGTA TGCAGATTTA ATATCTGAGT 2220

GAGCCATGAC AATGGAAACA CCAATTGAGA ATAAACCTAG TGCTAGAGAT ACGAATTGTG 2280

CAATTTTCGT ACGACGATTG TCTGACATAT TTTTAGAAAT GACATCTTGA ATATCCAATG 2340

TCCATGAAGT TGCAACAGAG TTCAAACCTG TTGAAATAGT TGATTGAGAT GCTGCATAAA 2400

TCGCTGCCAA GATCAAACCT GTGATACCTA CTGGTAACTG GTATGCAATA AAGTACATAA 2460

AGATTTGGTC TTGAGGGATA TTGCTAGCTG CACTATCTGC ATTTTGTACT TGATAGAATA 2520

CGTACAAGCC TGTACCAATC AAGTAAAAGA CTGTTGCAGT TGCAAGTGAG AAAACACGGT 2580

TTGTGAACAA CATCTTATTA AGTTTCTTAA TATTTTGTGT TGTAGTAAAA CGTTCAACGA 2640

AATCTTGAGA TGAAGCATAG GAAGACAAGA TTGTAAAGCC TGAACCCATC ACAATTAAAA 2700

AGATGGAGTT TGAAAGCAAG TTAGGATCGA AAAGTTTTTC ATTTGCAGCA AGGAATTTCC 2760 

CGTTTGCTAA TGTTTCTGCT ACTGCACCAA AGCCACCTTT AATATTAGCA ATCAGTACAA 2820 

ATAAAGCTAA AACGACACCA CTAATCAGAA TCACACCTTG AATAAAGTCT GTCCATAATA 2880 

CGGATTTTAG ACCACCAGTA TAAGAATAAA CAATTGCAAC TACACCCATC AAAATAATCA 2940 

AAATATTGAT GTCAATTCCT GTCAATACTG ATAAACCAGC TGATGGGAGG TACATAATGA 3000 

TAGACATACG TCCCAATTGA TAAATAATAA ACAAGAGTGC TGAAATAATA CGAAGTGCTT 3060 

TAGAATTAAA ACGTTTATCC AAGTAATCAT ATGCCGTATC GATGTCTATC CGTGGAAAGA 3120 

TAGGTAAGAT AAAACGAATT GTCAGTGGAA TAGCTACTAC CATCCCTAAT TGAGCAAACC 3180 

ATAAAATCCA GCTACCTGCA TAAGAGCTAC CAGCGAGTCC CAAGAAGGAA ATCGGACTGA 3240 

GCATTGTGGC AAAAATGGAT ACCGAAGTAA CATACCAAGG AACCGAACCA TCTCCTTTAA 3300 

AGAACTCTTT TCCTTTCATC TCTTTTTTAG AGAAATAGAT ACCTGCAACC AACACCGCAA 3360 

GTAAATAAAC AATCAAGATA ATTAAGTCAA TTATTGTAAA TCCTGTTGTG CCCATAACAT 3420 

ATCTCCATAT TGATTTTATT TATTATAAAA ATTCTTTTCG TGCTTGTTGA ATAAGTTCTG 3480 

CTGCTTGTTT TGCAACTTCC AAGTCACCTT CTGCCAATGC TTCTAAAGGT TGACGAACAG 3540 

AACCTAAATC AAGTTTTTCA TTTAGACGCA AAACTTCTTT TGCTACAGCA TACATATTTG 3600 

CCTTACCTGA TATCATCTTA TAGATAACTT GATTGATAGC ATATTGAAGT TTTTTAGCTG 3660 

TATCTAAATC TCGTTCTTGA ATCAAACTTT CCAATTTCAA GAACAAATCT GGCATAACGC 3720 

CATAAGTACC ACCAATACCA GCTTCTGCTC CCATCAAGCG ACCACCAAGA TATTGTTCAT 3780 

CTGGACCATT GAATACAATG TAATCTTCTC CACCTGCAGC TACAAACATT TGAATATCTT 3840 

GTACAGGCAT AGAAGAATTT TTAACTCCAA TCACACGAGG ATTTTGACGC ATTGTTGCAT 3900 

ACAAACTACC AGTCAACGCA ACCCCTGCCA ATTGTGGAAT ATTATAGATA ATAAAATCTG 3960 

TATTTGACGC AGCTTCACTC ATTGCATTCC AATATGCTGC GATTGAATAC TCTGGCAATT 4020 

TGAAATAAAT AGGTGGGATA GCTGCAATAG CATCGACTCC AACACTTTCT GAATGTTTTG 4080 

CCAATTCGAT ACTATCTTTC GTGTTATTAC ATGCAATATG GTTGATAACT GTTAATTTAC 4140 

CTTTAGCAAC TTCCATAACA GCTTCAATAA TTTGTTTACG ATCTTCTACA CTTTGGTAAA 4200 

TACATTCACC TGAAGAACCA TTTACATAGA TACCTTTTAC ACCTTTGTCA ATGAAATATT 4260 

GTACCAGAGA TTTTACACGA TCTTGGCTAA TTTCACCATT TTCATCATAG CAAGCATAAA 4320 

ATGCAGGGAT AACGCCTTTG TATTTAGTTA AATCTTTCAT CAGATTTCTC CTTTATATTG 4380 

TTTTTTATTT GATGACATTA ATAAATCGCT GAGCAATTTC TTTTGGACGT GTAATCGCTC 4440 
CACCAATGAC TACACTGGTA ACACCTAAAC TATAAGCTTT TTTTAATTGT TCTGGATAAT 4500

GAATTTTTCt TCGGCAATTA CCGGAATATT AAAATCAGCC AATTTTTTCA TTAGTTCAAA 4560

ATCAGGCTCA TCTGATTGTA CACTTGTACT TGTGTAACCT GATAATGTTG TACCAACAAA 4620

ATCAACGCCT GATTTAAATG CATAGAGACC TTCATCTAAA TTACTTACAT CCGCCATCAG 4680

CAATTGATTC GGATATTTTT CTTTTATTTT TTTGATAAAT TCACTGACAA CTAAGCCATC 4740

ATATCTTGGT CTTAAAGTTG CATCAAATGC AATGACTGTT GTTCCGCATT CTACAAGTTC 4800

ATCTACTTCT TTCATCGTAG CAGTAATATA TGGTTCTTGA GGTGGATAAT CCCTTTTGAT 4860

AATTGCAATT ATTGGTAAAT CTACTACTTT CTGAATTGCT TTAATATCAC GCACAGAATT 4920

TGCGCGAATG CCCACTGCTC CTGCCTCTAA AGCTGCTTTA GCCATAAAAG GCATCAAGCT 4980

AAATTCTTCA TTATAAAGGG CTTCACCAGG TAAAGCTTGA CAAGAAACAA TGACTCCACC 5040

TTGAACTTGG CTTATAAATT TTTCTTTAGT CCAAATTTGG CTCATTTTAT TATTCCTCCT 5100

TATGGATAAT AGTTTGATTG TAATAATATT GTCTCTCTGG ACTTTCCAGA TAATTAGAGA 5160

ATAAGCAGTC TGTAATTAAA AGTATTGGAA ACTGAGGTGA TATGCGATTG CCATACGAGA 5220

GATGATCGGT CGAAGCTAAT AACAATAGTT CATCAAAGAA ACAATCTTCT TCGTCAAATT 5280

TTCTTGTAGT CATTAAAACT GTTTTAGCGC CTTTATCTGC AGCTTTTTGT AGACCTTCTA 5340

GTACAATATC AGTTTGACCT GAAATGGATG CTCCAATGAC AAGGCAATTT TCATTAAGTA 5400

GTAAGCTACT CCACAAAATC ATATCCTCGT CTGATAATAC TTCACCAATC ACTCCGAGAC 5460

GCATAAATCT CATCTTCATT TCTTGTAAAG CAAGAACAGA ACTTCCTTTA CCGTAGAGAT 5520

ATACACGCTC AGCAGTTTCT ATCATCTCAG CAATACGCTC AAGTTGAACT TCATCAAGAA 5580

CCGTGTAAGT TTTTCTCAAC ATTTCCTCAT AGTCGGATAA AACTTTTTCT GTTGCCTCTG 5640

TATATAATGC CAACTTTTCT TTCTCATGAA TCATCTCTTG GTATTTGAAA ATGAATTGTC 5700

TAAAACCTTT AAAACCACAT TTTTTCGCAA ATCGAGTCAA TGTTGCTTTG GATACATTAA 5760

GGTATTCGCA CAATGCTTTA GATGAATAAT CATTCAGAGG TTGCTGTTTT AAGAAGAATT 5820

TAGCAATGTC TTTTTCAGCA TATGCCATAT TTGGTAAGTT AGCTTCTATC ATTGGAATTA 5880

GTTCTTTTTG CAGTAACATA TGAGCTCCTT AGTTGAAGTA AACGTTTACA TTCTTTATTT 5940

TAACACTTTT TTTTTTTTTC AATATTTTTC ATAAATTAGA AACTAGTTTC CAATTTCTTT 6000

CGTTTCATAA CAGAACAACA AACATAAAAA TATAATAGTT TTTATTCTTT TTATCGTAAT 6060

TATATGTATT GTAAGAACGT TTATCACTAA TAATATGTTC ATATTAAAAT ATTTTAGTAA 6120

TATTTTATTT TGGTTTTATT ATTTCTTTTC GGAATTTCTA TATAATATTT TATTTCTAAA 6180
AAAATTGAAA AAATATTTCT AGTTTCTTTA TTTTATATAG GTAATATATT TTATTTCTAA 6240

ATTAAAAGAG AATCCCATAA AAACTACAGA TTTATGAGAT AAATCAGGTC ACCTATTTTA 6300

AAAAAGCAGC AAACTATAAA CTAAAAAGTT CCACACCAAA TGTAACCCCA TACTTCCCCA 6360

TAAGTCAGAT TTATAGCGCA CCATACCTAA AAACATTCCA AGTGAAACGT ACAGACACCA 6420

AGCTAGAATG GTTCCTGGAT GATGTACTAA GGCAAATAAA ACACTTGTCA AAGCAACTCG 6480

AATATCTAAT TTTCTAACCA AGTTCCATAA AATTTCACGA TACAGAAATT CTTCAACCAT 6540

ACTCGCATTG ATTAAGAACA ATAAAAATGA AAACCAAGGA ACTTGATGTT GAAGGCCAAT 6600

TAAATTTGTT TGATTCGTGC TTCCTTGAGC ATGAATCAGG CTAAAACATA GACTTATAAT 6660

CAGTAGACTA GCTAGTCCAA TACCAAGGCA TTTCATCCTA GTTTTCATAT TGACCTTGAC 6720

CACTTGTTTT CGTTGACCAT ACATCCATAA AAAAGAAAAA AGAGACGCAC CATAGAGAAC 6780

CTGTAGTATA GTTAACTCAC CGATACAAAG AAATTTCAAT AAGTATAGAG ATACCAATAG 6840

GACATTTACT TGTTGGAATA TATAAACTGG AATTATTCTT TTCATAGTTA CCTCCGAAAT 6900

AAATCTTCAT AATCTAAATC TAATATCTGC ACAATCCTTT CTACCCATGG ACTTTGAGGC 6960

ATTCGTTGTT CCATCTTGTA GTGGCGAATC TTTTGATATA AACGATTCAA TTCACTTGGA 7020

TAGTGAAACT CTCCCGCAAA CATTTTTCTG GTTAACTCAA TCCAGCTGAT ATTTCTTTCA 7080

GCCAAAATAA TGGACAAGTT CTCCCAAAAT CGTTCAGCCA TATTrCTTCT CCTTTAGTTA 7140

GATAAATAAT GTGTTTGyGC CATGTAAATC AATTGTTTCG TATCTCTTGG CAATAGAGCT 7200

CTAGCCTCTT CCAAATTCAG ACTTGGATAA ACCCGCTTAT TTGAAACCAC AAAAGGAAGT 7260

CCGATGGTTA GTTCAGGATT TTTTAAAATT ATCTCAACGA AATCCGTTAA TCTTAGATTG 7320

TCACGGTTCT TAAATCGTAA TAAATTGGGA GATAAAAACT CAAAACAATC TGAAGAATAG 7380

CTCATCATCT CAATTAATTT GTCCTTTGTC ATTTCAGAAA CTGAATGACA AGATACCTCA 7440

ATGCCATAGT TTTGGAAGAA GTCTAAAAGA AGTTGATTTC TTTGGCTATT TTTACTTAGA 7500

TAGAGATCAA TCATGGGAGA CCTCCAACAA ATTTGCTTCC ATTTGATATT CTGAGACGAT 7560

TAAGGAATCT AACAACTTTG AGAAGTTAAT CGATTTCTTG TCTTCATCAT AAGCTTTTAC 7620

AGTTACTTGG GTTGTAAGTA TCCCCTCTTT TCCCTCGGCT CGATAGTCTT GTCAATATAA 7680

AACAAAAACA AGATTCTGAT TATCATCTAC AAAGGCATTA ACTCCGTTCT TTATATCCTG 7740

ACTTTCAAGG AATTCCATAA CGTTTTGAAG ATAGGATTCA TAAAATAGTG GGTAATTATG 7800

TTTTTTATGG TAATCATCTA AAAATGTTAC CTCAAACTCA CATGGATAAT TGGGCATCAA 7860

AAATATTTGT TCATCCAGCT GTTTGATTTC TGCATCATGT AATTCTGTTT CTAATTCATC 7920

ACAATCTAGT ATTGATTCTT TATTTAATGC TTTTATCTTT TTCCTCTATT TCTTTTAATT 7980
TCTTTGCGAT TGCGGCAATC ACAGGAACGG TTACACTATT ACCAACTTGT TTATAGAGCT 8040

GACTATTAAT AGAGACTTTT CTAGCAGCTT CAAAAGCCTA ATCAGGAAAG CCATGCAATC 8100

GAAAACACTC TTTAGGAGTG ATTCGTCGTA TTCTCAAACG GTAAAATTGT CCATCTATTA 8160

AAACACCAGC TACTTGGTAA ACTTGTTTAT CTTCTCCTTC ATAGCTAGCC ACTACTACTC 8220

CCATTTGACC ACTAGTTGTT AACGTATTAG CTATACGTTT TCCAACTCTA CCACGACGAT 8280

ACTGAGAACT TGGTCTTTCT AAATTGATTG AATCCGCAAT CTCTGCTTGA GCATATCCTT 8340

TTTTCGTTGC TTCCCGTACT TTTAGAAATT GGATTGGTTC TGGAATTAGT ATTTTGGGGA 8400

TTTTATCTCC TCCTTGCATC GTAGTCAGTG TTGGAGATAA GCCCTCACTT CCATAGACAC 8460

GACCTGTCTC CTTAAAGCTA GTCGGTAAAT CTCCAACAAC GACAATGCCA TAACGATCCT 8520

GAGTATTTAA AGTAAACATC GGCTCTTGAT TTTCCTTAAA GCGTCTCCCA TTTTGTCTCT 8580

TGTCTAATCT ATCTGGTGTC ATACAAGGAA TCGCAACTTT AAATCCTTCT CCTTTACCAC 8640

GAACTAAGGT TGGCGCAAGA CCTTCTGAAT AATAGACTTT ACCGCTCATT CCACTTCTTG 8700

ATGGATTCAA ATTTCCTAGT GCTTTCAAAG TCTCAGAGTT AGTTGCTTGA CCTTCTCGTG 8760

TGAAAGGAAA TAAGAGTCTG GTACCTTTCT TTCTAGAATG TCCGATAATA AACACCCTCT 8820

CTCTGTTTTT GGGAACGCCA AAATCCTTAC TGTTAAGCAC CTGCGACTCA ACATCAAACC 8880

CCAACTCATC AAGTGTGGTA AGTATTGTGG TGAACGTCCG TCCCTTATCG TGATTGAGTA 8940

GGCCTTTAAC ATTTTCAAGA AAAAGAAAAC GTGGTTGGAT TTGTTTGGGG GCCCGAGCAA 9000

TTTCAAAGAA CAAAGTTCCT CTAGTATCTT CAAATCCCAA TCGTCTTCCT GCGATTGAAA 9060

ATGCTTGACA AGGGAATCCC CCACAGATGA CATCGACTTT CCCTCTAAGT TTTTTAAATT 9120

CGTCATCTGA AACATCTCGT ATGTCATGAA ATTCTATTTC TCCTTCCGTT TGAAAAATGG 9180

ACTTATAAGA TTTCCTAGCA AATTTATCAA TCTCACAAAA TCCCAAGCAC TCATGCCCTT 9240

GAGCTTCCAT TCCCATCCTA AAGCCTCCTA TCCCAGCAAA TAAATCTAAA ACCCAAATCA 9300

TTCATACCTC TCTCAACTAG ATGTAACTTA CAAAACCCCT GACCTCATGA GCCACTTTCT 9360

TCCTCCTCAT GAGGTCAGTT TTACTTTCTG CTGTTCCAGT ATCGTTTTTC CTCGCTAGAT 9420

TTCCTCAAAA GGGCAGACTC CTCCCTTGGT TCGTCACACG ATTTTTTCAT CTCGACTGTT 9480

CTTTAATGCA TCATTAACGA CGCTTTTCTT CTAGGTGGTT CATAAGGAAC AGGAAGATTC 9540

AGGTTGACTT TTCTAATCCT AGAATAAAGT GCTGAAAACA ATTCGGAATA GGCATAGAGA 9600

CTAGACAATT TGAGGAGCTG CTTGCGTCGT GTTCGAACAC ATTTTCCTAC CACGTGAAGA 9660

AAAAGATGGC GGAAGCGTTT GATTGTTAAA GTTTGGAAGT CACCTCCAGG TAGATGTTTG 9720

AGAAAAAGAT AGAGATTGTA GGCGATACAG CTCATCATCA TACGAACTCG TTTTTGATTA 9780
AGGTTGAACT ATCCGTTTTA TCGCCAAAAA ATCCCTCCTT CATCTCCTTG ATGAAATTCT 9840 
CGGCTTGACC ACGTCCACGA TAAAGCTGAA ACTGGTCTTG GCTTGTTCCG GTACCGA 9897 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8148 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

CCGTGGAACA AGCCAAGACC AGTTTCAGCT TTATCGTGGA CGTGGTCAAG CCGAGAATTT 60

CATCAAGGAG ATGAAGGAGG GATTTTTTGG CGATAAAACG GATAGTTCAA CCTTAATCAA 120

AAACGAAGTT CGTATGATGA TGAGCTGTAT CGCCTACAAT CTCTATCTTT TTCTCAAACA 180

TCTAGCTGGA GGTGACTTCC AAACTTTAAC AATCAAACGC TTCCGCCATC TTTTTCTTCA 240

CGTGGTAGGA AAATGTGTTC GAACAGGACG CAAGCAGCTC CTCAAATTGT CTAGTCTCTA 300

TGCCTATTCC GAATTGTTTT CAGCACTTTA TTCTAGGATT AGAAAAGTCA ACCTGAATCT 360 

TCCTGTTCCT TATGAACCAC CTAGAAGAAA AGCGTCGTTA ATGATGCATT AAAGAACAGT 420 

CGAGATGAAA AAATCGTGTG ACGAACCAAG GGAGGAGTCT GCCCTTTTGA GGAAATCTAG 480

CGAGGAAAAA CGATACTGGA ACAGCAGAAA GTAAAACTGA CCTCATGAGG AGGAAGAAAG 540 

TGGCTCATGA GGTCAGGGGT TTTGTAAGTT ACATCTAGTT GAGAGAGGTA TGAATGATTT 600 

GGGTAAATAC AATGAGCTTG AAAGAAGTAG CAAACTCACC AAGCGCCAAT TCTTTGAGAA 660 

TCAGATGCTG GATTATACCA TCATTGCGCA TGAGAGTTTT GAAATCATCC GTCATTCTGT 720 

CTACCAGACA GATGATCGTG AAGTGGAAAA TGCTCTGGCT TTTGAAGTGA AAAATGATGA 780 

AACAGACAAG CTGATTCTGT TATTAAGCGA GGATATTGGT GTAGGTGAAA AATTGTGCCT 840 

CGTTGACGGA ACAAAAATGC GTGGAAAATG TTTAGTATAT GATAAAATAA ATGAGAGAAT 900 

GATTCGCTTG CAGTGCTAGA AATAGGCATT TTGAATAGTG AATATGTTAT AATAAGTATT 960 

AGTAGGAGGT GTTTTAGATT GGAGAAGAAA CTGACCATAA AAGACATTGC GGAAATGGCT 1020 

CAGACCTCGA AAACAACCGT GTCATTTTAC CTAAACGGGA AATATGAAAA AATGTCCCAA 1080 

GAGACACGTG AAAAGATTGA AAAAGTTATT CATGAAACAA ATTACAAACC GAGCATTGTT 1140 

GCGCGTAGCT TAAACTCCAA ACGAACAAAA TTAATCGGTG TTTTGATTGG TGATATTACC 1200 

AACAGTTTCT CAAACCAAAT TGTTAAGGGA ATTGAGGATA TCGCCAGCCA GAATGGCTAC 1260 
CAGGTAATGA TAGGAAATAG TAATTACAGC CAAGAGAGTG AGGACCGGTA TATTGAAAGC 1320

ATGCTTCTCT TGGGAGTAGA CGGCTTTATT ATTCAGCCGA CCTCTAATTT CCGAAAATAT 1380

TCTCGTATCA TCGATGAGAA AAAGAAGAAA ATGGTCTTTT TTGATAGTCA GCTCTATGAA 1440

CACCGGACTA GCTGGGTTAA AACCAATAAC TATGATGCCG TTTATGACAT GACCCAGTCC 1500

TGTATCGAAA AAGGTTATGA ACATTTTCTC TTGATTACAG CGGATACGAG TCGTTTGAGT 1560

ACTCGGATTG AGCGGGCAAG TGGTTTTGTG GATGCTTTAA CAGATGGTAA TATGCGTCAC 1620

GCCAGTCTAA CCATTGAAGA TAAGCATACG AATTTGGAAC AAATTAAGGA ATTTTTACAA 1680

AAAGAAATCG ATCCCGATGA AAAAACTCTG GTATTTATCC CTAACTGTTG GGCCCTACCT 1740

CTAGTCTTTA CCGTTATCAA AGAGTTGAAT TATAACTTGC CACAAGTTGG GTTGATTGGT 1800

TTTGACAATA CGGAGTGGAC TTGCTTTTCT TCTCCAAGTG TTTCGACGCT GGTTCAGCCC 1860

TCCTTTGAGG AAGGAGAACA GGCTACAAAG ATTTTGATTG ACCAGATTGA AGGTCGCAAT 1920

CAAGAAGAAA GGCAACAAGT CTTGGATTGT AGTGTGAATT GGAAAGAGTC GACTTTCTAA 1980

AATGAAGGAA AATGACTTGC AATCTCTGTT AAGAAATAAA ATAATCCCAC CTAGAACAAG 2040

CTAGGTGGGA TTATTTGCGT ATGAAATGAG AAATTATGGG AGCAAGCTCC TAAATCAACT 2100

GTTTTTGATC TACTTCTTTA ACTACTTGAT AAAAGTTATA GAAGTAGGCC AAACTTGAAA 2160

TGATGGTTAC GACTAGGAAT ATTGAAAATT TCCATTGGAC AGGGTTGGTT AAAAGTTGTG 2220

GAAAGGATAT GAGGAGAAAG AAGAGGGCTG CGTTGAGGAC AGGTATGCGT TTTGATTGTA 2280

TTTTCTCAAG TCCTTTATTG AGCGCAGGAA GAAAGAGGAG TAGGAGTAGT AAAACTGTAT 2340

GAGAAATAGC TCCTGAAGTA AGGGCGAAGA AAAGGAAAAT ACTGATAAAA ACATGAATGA 2400

TCAGTAGTCT AGCTAGTGAT TTCATAAGGC ACCTCCTAAT GCTGGTCTTT TTTAGGTGTT 2460

GCAATACGAA GTGAGTCGAC AATATGTATC ATCACTCCGA AAAAGAAAGC TCCCAGTATA 2520

GTTTTAAAAA TATGTTTTGT ATTTAGAAGA GAACTGATAA AATTTGGATT TTCACTTGTT 2580

AGGGTATCAA TGAGTGGAAT TATAAAAAAT ATCACTGTTG CATAAATCGA ACCTGCTTTC 2640

AGACCAGGAT AACGTAACTG TTTCTTTTCT TTTTTCATGA GTTTCCTCCT AATCCTCATC 2700

TTGATTTTTC TTAGTTTTTG CAATGCGACG GGAGATGAGG AACTGTATGC TCGCTCCGAA 2760

GAAAATAGAA CCGAGAATAC TTGATACACC ATTTCTTATA GTGAGAAGAG AATGAAAATA 2820

GTCCTGACCT TCATCTATGA GTATCCTGAG AAGAGGAGTT ATAAAAAACA TCCATAGAGG 2880

AAAGAACAAA GCTGCTTTGA GACCTGGGTA GTGTAGTTGG TTGCTTTCTT TGTGATTCAG 2940

CATATCTGGT TCAATGACTG TGATGCCTGT TTTTTTCATT TGGTAGGTGA CATAGCCAGA 3000

AGCGATGAGG GCAATCACTA AAATCAGAGG AGGATAGATT AGAGCCACTT CTTGAGGGTA 3060

TTTATAGGCC AGAAGGAGTG GAATAAGATT TCCGAAAATC ATGAGATAAA AGAGGATGAT 3120 

AAAGACTTGG TTCCCAATAC TATCGGCCTC ACGCCGTTTG TATTCGTCAA GGGGACCAGA 3180 

AATACCGTAT GTGCGTTTGA TCAGTTTTTC AGTGAAGGTT TCTTTTTTCA TGAGTTTGCT 3240 

CCTTTTTTAA AAATCTTCCT CCCAAAAGAG ACTGTTGAGG TCAGTTTGGA GGCTGCGGGC 3300 

GAGATTGAGA CAGAGTTCCA AGGTTGGATT GTACTTGTCG TTTTCAATCA TATTGATAGT 3360 

CTGTCTCGAG ACACCGATAT CCTTGGCGAG TTCGAGCTGG GAAATACCCA ATTCCTTGCG 3420 

AAATTCTTTC ACACGATTCA TCTGTTCTCC TTTCTGATTT ATGTCGTATA TATTTGACTA 3480 

TATTATAGTC TTTTAAACAT AAAGTGTCAA GTATTTTTGA CATATTTTTT GAAGAAATAG 3540 

TAGTCTCCTT GTCCTATTTG TCTGACAAGT GCAAGCTGGT CGGATTTGTG GTAAAATAGA 3600 

TAAGATATGA CAAAAGAATT TCATCATGTA ACGGTCTTAC TCCACGAAAC GATTGATATG 3660 

CTTGACGTAA AGCCTGATGG TATCTACGTT GATGCGACTT TGGGCGGAGC AGGACATAGC 3720 

GAGTATTTAT TAAGTAAATT AAGTGAAAAA GGCCATCTCT ATGCCTTTGA CCAGGATCAG 3780 

AATGCCATTG ACAATGCGCA AAAACGCTTG GCACCTTACA TTGAGAAGGG AATGGTGACC 3840 

TTTATCAAGG ACAACTTCCG TCATTTACAG GCATGTTTGC GCGAAGCTGG TGTTCAGGAA 3900 

ATTGATGGAA TTTGTTATGA CTTGGGAGTG TCTAGTCCTC AATTAGACCA GCGTGAGCGT 3960 

GGTTTTTCTT ATAAAAAGGA TGCGCCACTG GACATGCGGA TGAATCAGGA TGCTAGCCTG 4020 

ACAGCCTATG AAGTGGTGAA CAATTATGAC TATCATGACT TGGTTCGTAT TTTCTTCAAG 4080 

TATGGAGAGG ACAAATTCTC TAAACAGATT GCGCGTAAGA TTGAGCAAGC GCGTGAAGTG 4140 

AAGCCGATTG AGACAACGAC TGAGTTAGCA GAGATTATCA AGTTGGTCAA ACCTGCCAAG 4200 

GAACTCAAGA AGAAGGGGCA TCCTGCTAAG CAGATTTTCC AGGCTATTCG AATTGAAGTC 4260 

AATGATGAAC TGGGAGCGGC AGATGAGTCC ATCCAGCAGG CTATGGATAT GTTGGCTCTG 4320 

GATGGTAGAA TTTCAGTGAT TACCTTTCAT TCCTTAGAAG ACCGCTTGAC CAAGCAATTG 4380

TTCAAGGAAG CTTCAACAGT TGAAGTTCCA AAAGGCTTGC CTTTCATCCC AGATGATCTC 4440 

AAGCCCAAGA TGGAATTGGT GTCCCGTAAG CCAATCTTGC CAAGTGCGGA AGAGTTAGAA 4500 

GCCAATAACC GCTCGCACTC AGCCAAGTTG CGCGTGGTCA GAAAAATTCA CAAGTAAGAG 4560 

GGAAAAAGAT GGCAGAAAAA ATGGAAAAAA CAGGTCAAAT ACTACAGATG CAACTTAAAC 4620 

GGTTTTCGCG TGTGGAAAAA GCTTTTTACT TTTCCATTGC TGTAACCACT CTTATTGTAG 4680 

CCATTAGTAT TATTTTTATG CAGACCAAGC TCTTGCAAGT GCAGAATGAT TTGACAAAAA 4740 

TCAATGCGCA GATAGAGGAA AAGAAGACCG AATTGGACGA TGCCAAGCAA GAGGTCAATG 4800 
AACTATTACG TGCAGAACGT TTGAAAGAAA TTGCCAATTC ACACGATTTG CAATTAAACA 4860

ATGAAAATAT TAGAATAGCG GAGTAAGATA TGAAGTGGAC AAAAAGAGTA ATCCGTTATG 4920

CGACCAAAAA TCGGAAATCG CCGGCTGAAA ACAGACGCAG AGTTGGAAAA AGTCTGAGTT 4980

TATTATCTGT CTTTGTTTTT GCCATTTTTT TAGTCAATTT TGCGGTCATT ATTGGGACAG 5040

GCACTCGCTT TGGAACAGAT TTAGCGAAGG AAGCTAAGAA GGTTCATCAA ACCACCCGTA 5100

CAGTTCCTGC CAAACGTGGG ACTATTTATG ACCGAAATGG AGTCCCGATT GCTGAGGATG 5160

CAACCTCCTA TAATGTCTAT GCGGTCATTG ATGAGAACTA TAAGTCAGCA ACGGGTAAGA 5220

TTCTTTACGT AGAAAAAACA CAATTTAACA AGGTTGCAGA GGTCTTTCAT AAGTATCTGG 5280

ACATGGAAGA ATCCTATGTA AGAGAGCAAC TCTCGCAACC TAATCTCAAG CAAGTTTCCT 5340

TTGGAGCAAA GGGAAATGGG ATTACCTATG CCAATATGAT GTCTATCAAA AAAGAATTGG 5400

AAGCTGCAGA GGTCAAGGGG ATTGATTTTA CAACCAGTCC CAATCGTAGT TACCCAAACG 5460

GACAATTTGC TTCTAGTTTT ATCGGTCTAG CTCAGCTCCA TGAAAATGAA GATGGAAGCA 5520

AGAGCTTGCT GGGAACCTCT GGAATGGAGA GTTCCTTGAA CAGTATTCTT GCAGGGACAG 5580

ACGGCATTAT TACCTATGAA AAGGATCGTC TGGGTAATAT TGTACCCGGA ACAGAACAAG 5640

TTTCCCAACG AACGATGGAC GGTAAGGATG TTTATACAAC CATTTGCAGC CCCCTCCAGT 5700

CCTTTATGGA AACCCAGATG GATGCTTTTC AAGAGAAGGT AAAAGGAAAG TACATGACAG 5760

CGACTTTGGT CAGTGCTAAA ACAGGGGAAA TTCTGGCAAC AACGCAACGA CCGACCTTTG 5820

ATGCAGATAC AAAAGAAGGC ATTACAGAGG ACTTTGTTTG GCGTGATATC CTTTACCAAA 5880

GTAACTATGA GCCAGGTTCC ACTATGAAAG TGATGATGTT GGCTGCTGCT ATTGATAATA 5940

ATACCTTTCC AGGAGGAGAA GTCTTTAATA GTAGTGAGTT AAAAATTGCA GATGCGACGA 6000

TTCGAGATTG GGACGTTAAT GAAGGATTGA CTGGTGGCAG AACGATGACT TTTTCTCAAG 6060

GTTTTGCACA CTCAAGTAAC GTTGGGATGA CCCTCCTTGA GCAAAAGATG GGAGATGCTA 6120

CCTGGCTTGA TTATCTTAAT CGTTTTAAAT TTGGAGTTCC GACGCGTTTC GGTTTGACGG 6180

ATGAGTATGC TGGTCAGCTT CCTGCGGATA ATATTGTCAA CATTGCGCAA AGCTCATTTG 6240

GACAAGGGAT TTCAGTGACC CAGACGCAAA TGATTCGTGC CTTTACAGCT ATTGCTAATG 6300

ACGGTGTCAT GCTGGAGCCT AAATTTATTA GTGCCATTTA TGATCCAAAT GATCAAACTG 6360

CTCGGAAATC TCAAAAAGAA ATTGTGGGAA ATCCTGTTTC TAAAGATGCA GCTAGTCTAA 6420

CTCGGACTAA CATGGTTTTG GTAGGGACGG ATCCGGTTTA TGGAACCATG TATAACCACA 6480

GCACAGGCAA GCCAACTGTA ACTGTTCCTG GGCAAAATGT AGCCCTCAAG TCTGGTACGG 6540

CTCAGATTGC TGACGAGAAA AATGGTGGTT ATCTAGTCGG GTTAACCGAC TATATTTTCT 6600

CGGCTGTATC GATGAGTCCG GCTGAAAATC CTGATTTTAT CTTGTATGTG ACGGTCCAAC 6660 

AACCTGAACA TTATTCAGGT ATTCAGTTGG GAGAATTTGC CAATCCTATC TTGGAGCGGG 6720 

CTTCAGCTAT GAAAGACTCT CTCAATCTTC AAACAACAGC TAAGGCTTTA GAGCAAGTAA 6780 

GTCAACAAAG TCCTTATCCT ATGCCTAGTG TCAAGGATAT TTCACCTGGT GATTTAGCAG 6840 

AAGAATTGCG TCGCAATCTT GTACAACCCA TCGTTGTGGG AACAGGAACG AAGATTAAAA 6900 

ACAGTTCTGC TGAAGAAGGG AAGAATCTTG CCCCGAACCA GCAAGTCCTT ATCTTATCTG 6960 

ATAAAGCAGA GGAGGTTCCA GATATGTATG GTTGGACAAA GGAGACTGCT GAGACCCTTG 7020 

CTAAGTGGCT CAATATAGAA CTTGAATTTC AAGGTTCGGG CTCTACTGTG CAGAAGCAAG 7080 

ATGTTCGTGC TAACACAGCT ATCAAGGACA TTAAAAAAAT TACATTAACT TTAGGAGACT 7140 

AATATGTTTA TTTCCATCAG TGCTGGAATT GTGACATTTT TACTAACTTT AGTAGAAATT 7200 

CCGGCCTTTA TCCAATTTTA TAGAAAGGCG CAAATTACAG GCCAGCAGAT GCATGAGGAT 7260 

GTCAAACAGC ATCAGGCAAA AGCTGGGACT CCTACAATGG GAGGTTTGGT TTTCTTGATT 7320 

ACTTCTGTTT TGGTTGCTTT CTTTTTCGCC CTATTTAGTA GCCAATTCAG CAATAATGTG 7380 

GGAATGATTT TGTTCATCTT GGTCTTGTAT GGCTTGGTCG GATTTTTAGA TGACTTTCTC 7440 

AAGGTCTTTC GTAAAATCAA TGAGGGGCTT AATCCTAAGC AAAAATTAGC TCTTCAGCTT 7500 

CTAGGTGGAG TTATCTTCTA TCTTTTCTAT GAGCGCGGTG GCGATATCCT GTCTGTCTTT 7560 

GGTTATCCAG TTCATTTGGG ATTTTTCTAT ATTTTCTTCG CTCTTTTCTG GCTAGTCGGT 7620 

TTTTCAAACG CAGTAAACTT GACAGACGGT GTTGACGGTT TAGCTAGTAT TTCCGTTGTG 7680 

ATTAGTTTGT CTGCCTATGG AGTTATTGCC TATGTGCAAG GTCAGATGGA TATTCTTCTA 7740 

GTGATTCTTG CCATGATTGG TGGTTTGCTC GGTTTCTTCA TCTTTAACCA TAAGCCTGCC 7800 

AAGGTCTTTA TGGGTGATGT GGGAAGTTTG GCCCTAGGTG GGATGCTGGC AGCTATCTCT 7860 

ATGGCTCTCC ACCAAGAATG GACTCTCTTG ATTATCGGAA TTGTGTATGT TTTTGAAACA 7920 

ACTTCTGTTA TGATGCAAGT CAGTTATTTC AAACTGACAG GTGGTAAACG TATTTTCCGT 7980 

ATGACGCCTG TACATCACCA TTTTGAGCTT GGGGGATTGT CTGGTAAAGG AAATCCTTGG 8040 

AGCGAGTGGA AGGTTGACTT CTTCTTTTGG GGAGTGGGAC TTCTAGCAAG TCTCCTGACC 8100 

CTAGCAATTT TATATTTGAT GTAAGAATGG CACCCTGATG TTTCAGGG 8148 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9909 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

TACTCCACCC TTAATATCCG TTCCTGTAAA TACTTTACCG CTTTTAAGTT CATAGAATTG 60 

AACTTTTAAA TGCTTGTCTT CAAGCATCTT TTCCATCCAA TTTTTAGGAG TTTGACCAGC 120 

TTTAAATAAA AACCTTGCTG GGGTGATTAG TATAGATTTA TCTGCGATTT TATAAGCTTC 180 

ATCAATAAAA TAGTGATATA TCGGCTCATC TCTGGCTTCT CCTGTTTCCT GATACGGAGG 240 

ATTTCCTATC ACGACATCAA ATTTCATTTC ACTTTCCTCG CTAGATAGGC GCTCAAAACC 300 

TATCATTCTA TTCTTTTTCC AGTCTTTGAT ATGGGTTTTA GATTCTTCTA CTTCTTGGAC 360 

TTCTAGCTCA TCCGCAAACA AACTCAATTG TTGAGATTGC TTTTGTTTAG CTGAATAAGG 420 

ACTACTTTTT TTCAATCCAT CCATCTGAAA GACATTGTAA GAGATAATAG TCGCAATTTC 480 

TTTCTTTTGC TCTAATGTTG GTTGATTTCC AGTCTTAGCT AGATAATAGT CCTCAAAAGT 540 

TGCCAAAAGA TTCTCACGCG CCAAAAGGAG AGAATCTCCT TGATACTCAT AACCATACGA 600 

AGCATGATAA GCATCTTTTA CAAGTTTATA AAATGTGACT TCATCTGAAA CCTCACGACT 660 

AATCCGTTGC AGTTTTCTAT CAACAAAACC AACTCGCTCA GATAATGGAA TTTCCTCACC 720 

AGTTACGGTA TCATATCTCG TTACCATATA AGGTGCTTCA CCACAAGTTA CCTCTAACCA 780 

TCGTAAGTCC ACATACTCCT CAAGACTTAA CGAGCCTAAT TTCGATTCTA CATATCCATT 840 

TTGCTTTGCG ACCAACCACG TTGGTGTAAA CACTTCTGCC CTTATTTTTG TCCGATCTTT 900 

TTGTTCATAT TTGGATTTTT CAGATCTGGG CTGAATCAAG TTGGCAAAGT TTCCAGTAAC 960 

CTTACTTGGA TTGATGCGAT CACTTGGAGC AAATCCCTTT CCTAACAATT CATAAGAATG 1020 

CGTAnGCCAA ACAATTGATT TCTTTGTCGT TCGATCTTTT AAAAGAATTT TTAATAAGTC 1080 

AGCCGATTCT TTAGCCAAAC TTTCTTCACT AATATCTATT GTCATCAGCA ACCTCTCTTA 1140 

TATTGTAAGC CCTATTATAT CATATTTTAA AGAATGAAAA TTTACTTGAA AAAAGTAATT 1200 

CAATAAATAT CTCTCCGATG ACCAACTTCT AGAGTAGCAA CGACTAATTC ATCATCTACA 1260 

ATTTGTACGA TAACTCGATA ATTACCAATT CTATAGCGCC ATTGACCAAC GCGATTACCA 1320 

ACCAAAGCCT TTCCGTGTCG TCTTGGGTCT TCCAAAACAT TGGTTTGTAA ATAGTTTGTA 1380 

ATTAGCTTCT GCGTATAACG GTCCAATTTT TTCAATTGCT TGATAAAACG TCTTGTTGGA 1440 

ACTAATTTAT ACAAATTATT CATCCTTCAA GCCTAAATCA TGCATCATTT CTTCCCAAGT 1500 

AATGGGTTCA ACTCCTTTTT CCAAGTCTTC TAAATACTCT TGATAGGCTA AATCTGCCAG 1560 
ACGAGCATCG TATTCATCTT CTAGGGCTTC AAGAGTTTTG GTGCGAATAA GTTCCGAAAG 1620

GGAAACTCCT TCAAACTTAG CCATTGCTTT CATAAATGTT TTATCAGCTT CAGAAACTTT 1680

TAATGTAATA GTAGTCATCT TTTGTGCTCC CTTTTTTAAT GGTAACACCA TTGTATTACT 1740

TTTTAGGTGT TCAGTCAATA TAAAAAGAAC ACCTTCTCAG CGTTCTTTCT ATATCTCTGT 1800

CAATGGTGTT GCGGTATCTG GTGAGGTATC ATAAACCTTA AAGTCTACTC CGACTCCCAG 1860

ATCAGCTTGA GCCAGCTGAT TGACCATGGT CATATGAGCC AGTTCCTTGA TATTGTTTTC 1920

CTTAGATAAA TGCCCAAGGT AAATCTTCTT AGTACGATTT CCTAGCGTCC GAATCATAGC 1980

TTCAGCACCG TCCTCGTTAG AAAGGTGACC AAGGTCAGAT AGGATTCGTT GTTTGAGTCG 2040

CCAAGCGTAA GAACCTGATC GCAAAATCTC TACATCATGG TTGGCCTCGA 2100

ATCCGCATTT TCGACAATGC CCGCCATACG GTCACTGACA TAACCTGTAT 2160

GACAAAACTC TTATCATCCT TCATAAAGCG ATAGAACTGC GGTGCGACTG 2220

TACACCAAAA CTCTCGATGT CGATATCTCC AAAGGTTTTG GTTTTACCCA TTTCAAAAAT 2280

ATGCTTTTGC GAAGAATCCA CCTTGCCAAG ATATTTACTA TTTTCCATAG CTTGCCAGGT 2340

CTTTTCATTG GCATAAAGAT CCATACCATA CTTGCGAGCC AAAACGCCTA CTCCATGGAT 2400

ATGATCTGAA TGCTCATGGG TAATCAAGAT GGCATCCAGG TCTTCTGGCT TACGGTTAAT 2460

TTCAGCTAGC AGACTGGTAA TTTTCTTGCC AGACAAGCCT GCATCTACTA AAAGCTTCTT 2520

TTTTGAGGTT TCCAGATAAA AAGAATTTCC ACTGGAACCC GACGCTAAAA TACTGTATTT 2580

AAAGCCTATT TCACTCATTC TAGTCTTCTA CTTCATCCTC CCATACTTCT TCTTTCACTG 2640

CATCCTTATC ATAAGGGAGT ACAATGGTAA AGGTTGAACC CTTGCCGTAT TCACTCTTGG 2700

CCCAAATAAA GCCCTTATGT TGTTTGATAA TTTCTTTAGC GATAGACAGT CCTAGACCTG 2760

TACCACCTTG TGCACGACTT CTAGCACGAT GCACACGATA GAAACGGTCA AAGATACGTG 2820

GTAAATCCTG CTTAGGAATC CCCAAACCGT GGTCAGAAAT GGATAAAATC ATCTGGTCTT 2880

CAGTTGTCTT CATTCTGACA GTGATTTTAC CCCCATCTGG CGAATACTTA ATAGCATTAT 2940

TTAAAATATT GTCGACAACC TGCGTCATCT TATCTGTATC AATTTCCATC CAGATAGAAT 3000

TGATGGGATA ATCTCTCACC AACTCATATT TTTTCTCCTT TTCCTGTCCT TTCATCTTGT 3060

CAAAACGATT GAGGATAAAG GTAATAAAAG CAGTGAAGTT AATCAGTTCC ACATCTAGGT 3120

GACTGGTAGC ATTATCAATA CGTGAAAGAT GGAGGAGATC CGTCACCATG CGCATCATAC 3180

GGTTGGTCTC ATCAAGAGAA ACCTTGATAA AGTCTGGTGC TACAGTTTCA CACAAAGCCC 3240

CCTCATCCAA GGCTTCAAGA TAGGATTTTA CGCTAGTCAG AGGAGTCCGT AACTCATGGC 3300

TAACATTGGA AACAAAGAGT CTTCGTTCGC GTTCTTCCTT CTCCTGCTCC GTCGTATCAT 3360

GCAAAACAGC CACCAAACCT GAAATAAAGC CAGACTCTCG ACGTATCAAG GCAAAGCGAA 3420

CTCGAAGGTT CAAATATTCG CCATTGATAT CTTGGGAATC TAGCAACAAT TCTGGACTTT 3480 

GGGTAATCAA ATCACGCAAT TCATAGTTTT CTTCTATCTT GAGCAATTCC AAAATGCTTC 3540 

TATTCAGAAC ATCTTCCTTA ACCAACCCCA GTTGCTTCTT GGCTGTATCG TTAATCATGA 3600 

TAATCTGACC CCGACGGTTA GTCGCAAGAA CCCCATCTGT CATATAAAAC AGAATACTAT 3660 

TTAGCCTCTT ACTCTCTTGT TCTAGATTTT CCTGAGTGAG ACGAATAACC TCCGACAAGT 3720 

CATTCAAATT ATTGGTAATA TTGGTGATTT CAGACCCACC TTGCATATCA AGAACCTTGG 3780 

AATAATCTCC TGCAATCAAA TCTTTAACCT TTTGATTGAC TTGCTTCAAC TGAATATTAT 3840 

CACGTCTATT TTCCAGTAAT AAGAGGGTCA CAACAAGGAT GAAACCTAAC AAAATCAGGA 3900 

TAAAGATAAA ATCTCTGGTA AAAATGGTTT GTTTCAGTAA ATCAAGCATT ATTTCTCATG 3960 

TAATACCCTA CACCACGGCG CGTCAAGATA TACTCTGGTC GGCTGGGCGT ATCTTCAATC 4020 

TTCTCACGCA GACGTCGTAC AGTCACATCA ACTGTACGGA CATCACCAAA ATAGTCATAA 4080 

CCCCAGACAG TCTCAAGCAA GTGTTCGCGC GTGATGACTT GACCTGTATG CGATGCTAAA 4140 

TGATACAAAA GCTCAAATTC ACGATGGGTT AAGTCTAGTT CTTCGCCATA TTTTTTAGCC 4200 

ACGTAGGCGT CTGGAACAAT TTCTAAATCC CCAATTTGGA TAGGTTGAGG TTTACTATCT 4260 

GCTTCCTGAC CATCTACTGG CATAGGTTGA GAACGACGCA GAAGAGCTTT AACACGCGCC 4320 

TGCAACTCAC GATTGGAGAA GGGTTTTGTT ACATAGTCAT CTGCCCCAAG TTCCAAACCG 4380 

ATAACCTTAT CAAATTCACT ATCTTTGGCT GAAAGCATAA GAATGGGCAC ACTGCTTGTG 4440 

TTACGAATGG TCTTAGCAAC TTCTAAACCA TCAATTTCTG GAAGCATCAA ATCCAGAATA 4500 

ATAATATCTG GTTGCTCTGC TTCAAATTGC TCTAGCGCTT CACGACCATT AAAAGCAGTT 4560 

ACAACTTCGT AACCTTCCTT GGTCATATTA AACTTGATAA TATCCGAGAT TGGTTTCTCA 4620 

TCATCTACAA TTAGTATTTT TTTCATATGT TCACCTTTTT CTCTACTATT ATACCAAAAA 4680 

AATAGTCAGA AGACACAATA GCTAGTCTTG GCTACTGTCT AAGTTGGGTT GTGCATAAAG 4740 

CTGCCAGATT TTTTGTTGGG GTTTGGCAAG TGGGTAATTC TTGAATTCTT CTGGTGAAAG 4800 

CCAGCGAACT TCCCTATCTG AAAAATCATG GAAGTCACTC ACCTGACCTG CTACAATCTG 4860 

TACATGCCAT TTTCGATGAC TAAAAACATG CTGGACTGTA TCAAAACAAA CATCAAGCCA 4920 

ATCAACATCT AGGTCATAGT CCTGCTGGAA ACTCTCTTCT GGACTCGGAC CAAAGTTCAC 4980 

ACTTTCTTCC GCAACCTGAT GAAAGAGGTC AAAGTGCTCT TCTTGCGAAA AGTTATCAAC 5040 

TTCTATAAAG GGGAAATGCC AAAAACCTGC CAAGAGCTTT TCGCTTTCAT TTTTTTCAAG 5100 
TAAAAATTGT CCTTGAGAAT TTTTCACAAC TAAGGCTTTA AGATAAATAG GAACCGGCTT 5160

TTTCTTAGGA GATTTAATTG GATAACGGTC CATGGTTCCA TTCTGATATG CCGCACTAAA 5220

GTCCTTGACT GGGCTTTCTT CAGGTCTGGG ATTTACAGGA GACTCAATAT CAGACCCTAA 5280

GTCCATCAAG GCTTGATTAA AATCACCCGG ACGATCCGGA TTAATCAAGA TCTCCATCAT 5340

TGCCTGAAAA ATTTTTCGAT TACTTGGAAT CCCAATATCG TGGTTGACTT CAAACAGACG 5400

CGCCAAGACC CGCATGACAT TACCATCTAC AGCTGGCTCA GGCAAGTTAA AAGCAATACT 5460

GGAAATGGCT CCTGCTGTGT AAGGTCCAAT CCCTTTCAAG CTGGAAATTC CTTCATAGGT 5520

ATTTGGAAAT TGGCCACCAA AGTCAGTCAT AATCTGCTGG GCTGCAGCCT 5580

AACTCGAGAA TAATAGCCCA AGCCCTCCCA AGCTTTCAGT AAACTCTCCT 5640

TGCCAGACTT TCGACAGTTG GAAACCAGTC CAAAAATCTT TCGTAGTAAG RfiATAArTfTP 5700

ATCCACCCTG GTCTGCTGAA GCATGATTTC AGATACCCAG ATGTGATAAG Ci AT*T r PT w P A C**P x i x i x nv> X. 5760

TCTCCTCCAA GGCAAATCTC TTTTGTTTTC ATCATACCAA GCGAGAAGTT 5820

AGAAATGACT TTCTCCTCCG GCCACATGAC GATACCGTAT TCTTTCAAAT CTAAPATATP 5880

TCTAGTATAA CACAGAAGGT TTCACCTGTC TTTGTATCTG ATTTATAATA TTTTCAATAG 5940

ATAGTATATA ACTTTTCTAT CTACTTATAC TCAATGAAAA TCAAAGAGCA AACTAGGAAG 6000

CTAGCCGCAG GTTGCTCAAA ACACTGTTTT GAGGTTGTGG ATAGAACTGA CAGAGTCAGT 6060

ATCATATAcT ACGGCAAGGT GAAGCTGACG TAGTTTGAAG AGATTTTCGA AGAGTATAAA 6120

TCTTATTGAT GAACTGCTTG CAGTCTGAGA AAAAATGAGC 1 IwA LAI I A TTTCCAAACT 6180

CACTTAAAGT CAATTTCAAT CCACTAGAAC AAGCCTAGTA CAGTTCCATC GCTTTCAACA 6240

TCCATGTTGA GAGCTGCTGG ACGTTTTGGA AGACCTGGCA TGGTCATAAC ATCACCAGTT 6300

AAGGCAACGA TGAAGCCTGC ACCTAATTTT GGTACCAATT CACGAATGGT AATTTCAAAG 6360

TTTTCTGGTG CTCCAAGCGC ATTTGGATTG TCTGAGAAAC TGTATTGAGT TTTAGCGATA 6420

CAGATTGGCA ATTTGTCCCA ACCGTTTTGA ACGATTTGAG CAATTTGTGT TTGAGCTTTC 6480

TTCTCAAAGT TCACTTTGCT ACCACGATAG ATTTCAGTGA CAATTTTTTC AATCTTTTCT 6540

TGGACAGAAA GGTCATTATC ATACAAACGT TTATAGTTAG CTGGATTTTC AGCAATTGTC 6600

TTAACAACTG TTTCGGCAAG TGCTACTCCA CCTTCTGCTC CATCAGCCCA GACACTAGCC 6660

AATTCAACTG GTACATCGAT TGAGGCACAG AGTTCTTTTA AGGCTGCAAT TTCAGCTTCT 6720

GTATCAGATA CAAATTCGTT AATAGCTACA ACTGCTGGAA TACCGAACTT ACGGATATTT 6780

TCAACGTGGC GTTTCAAGTT AGCAAAACCT GCACGAACTG CCTCTACATT TTCTTCAGTC 6840

AGAGCGTCTT TAGCCACACC ACCATTCATC TTAAGGGCAC GAAGGGTTGC GACAATAACA 6900

ACTGCATCTG GAGATGTTGG CAAGTTTGGT GTCTTGATAT CAAGGAATTT CTCAGCACCA 6960

AGGTCCGCAC CAAAACCAGC TTCAGTAACA GTGTAATCAG CCAAGTGAAG GGCTGTTGTC 7020 

GTCGCCAAAA CAGAGTTACA GCCATGAGCG ATATTGGCAA ATGGACCACC GTGTACAAAG 7080 

GCAGGTGTAC CGTAAATTGT CTGAACCAAG TTTGGCTTAA TAGCATCCTT CAAAATCAAA 7140 

GCCAAGGCAC CCTCAACCTG CAAATCACCT ACAGAAACAG GCGTACGGTC ATAGCGATAA 7200 

CCAATAACGA TATTCGCCAA ACGACGTTTC AAGTCCTCGA TGTCCGTTGC CAAGCAAAGA 7260 

ATTGCCATGA TTTCTGAAGC AACTGTAATA TCAAAACCAT CCTCACGTGG AATACCGTTT 7320 

AGAGGACCAC CAAGACCAAC AGTCACATGG CGGAGCGTAC GGTCGTTCAA GTCCACAACG 7380 

CGTTTCCAGA GGATACGACG TTGATCAATT CCCAGCTCAT TCCCTTGGTG CAAGTGGTTG 7440 

TCAATCAAGG CAGAAAGGGC ATTGTTGGCA GTTGTAATAG CATGCATATC TCCAGTAAAG 7500 

TGGAGGTTGA TGTCTTCCAT TGGCAGAACT TGTGCATACC CACCACCAGC AGCACCACCG 7560 

TTGATCCCCA TGACTGGACC AAGAGACGGT TCGCGGATAG CAATCATGGT TTTCTTGCCA 7620

ATCTTGTTCA AGGCATCCGC AAGACCAATG GTAAGCGTCG ACTTTCCTTC ACCTGCAGGT 7680 

GTTGGGTTGA TGGCAGTAAC CAAGATCAAT TTACCGACTG GATTGCTCTC AACTGCACGA 7740 

ATTTTATCAA AGCTGAGTTT AGCCTTGTAC TTTCCGTACA ACTCCAAATC GTCATAAGAA 7800

ATACCAAGTT TCTCTACAAC ATCAACAATT GGCTTCAACT CAATACTCTG TGCGATTTCA 7860 

ATATCTGTTT TCATTCAAAA TTCCTCTAAC CTCTTATATG ATAATTCATT ATATCACAAA 7920 

ACAAGATTTT TAACATCCTA AAACTCTCTA AACGTTCGTA AATATCTCTG TTTTTAAGAC 7980 

TTTTAGAGTC CTTTCTTAAA TTTTATATGG CTTTATAGTT TGAAACTATA ATAAATCTTC 8040 

GTTTTTACCA AAAATTTATC ACTTTCATTT TACTTACCGC TTATTTTTGT GTACAATAGT 8100 

GCTATGAAAA TTTTAGTTAC ATCGGGCGGT ACCAGTGAAG CTATCGATAG CGTCCGCTCT 8160 

ATCACTAACC ATTCTACAGG TCACTTGGGG AAAATTATCA CAGAGACTTT GCTTTCTGCA 8220 

GGGTATGAAG TTTGTTTAAT TACGACAAAA CGAGCTCTGA AGCCAGAGCC TCATCCTAAC 8280 

CTAAGTATTC GAGAAATTAC CAATACCAAG GACCTTCTAA TAGAAATGCA AGAACGTGTT 8340 

CAGGATTATC AGGTCTTGAT CCACTCAATG GCTGTTTCTG ACTACACTCC TGTTTATATG 8400 

ACAGGGCTTG AGGAAGTTCA GGCTAGCTCC AATCTAAAAG AATTTTTAAG CAAGCAAAAT 8460 

CATCAGGCCA AGATTTCTTC AACTGATGAG GTTCAGGTTT TGTTCCTTAA AAAGACACCC 8520 

AAAATCATAT CCCTAGTCAA GGAATGGAAT CCTACTATTC ATCTGATTGG TTTCAAACTG 8580 

CTGGTTGATG TTACCGAAGA TCATCTGGTT GACATTGCAC GAAAAAGTGT TATCAAGAAT 8640 
CAAGCAGATT TAATCATCGC GAATGACCTG ACTCAAATTT CAGCAGATCA GCACCGAGCT 8700

ATATTTGTTG AGAAAAATCA GCTTCAAACA GTCCAGACTA AAGAAGAAAT TGCAGAACTC 8760 

CTCCTTGAAA AAATTCAAGC CTATCATTCT TAGAAAGGAA AACTATGGCA AACATTCTCT 8820 

TGGCTGTAAC GGGTTCAATC GCCTCTTATA AGTCGGCAGA TTTAGTCAGT TCTCTAAAAA 8880 

AACAAGGCCA TCAAGTCACT GTCTTAATGA CTCAGGCTGC TACAGAGTTT ATCCAACCTT 8940 

TGACACTACA GGTACTCTCA CAGAATCCTG TCCACTTGGA TGTCATGAAG GAACCCTATC 9000 

CTGATCAGGT CAATCATATC GAACTTGGAA AAAAAGCAGA TTTATTTATC GTGGTACCTG 9060 

CAACTGCTAA CACTATTGCA AAACTAGCTC ACGGATTTGC GGACAACATG GTAACCAGTA 9120 

CAGCTCTAGC CCTACCAAGT CATATTCCCA AACTAATAGC TCCTGCTATG AATACAAAAA 9180 

TGTATGACCA TCCAGTAACT CAGAATAATC TGAAAACATT AGAAACTACG GCTATCAGCT 9240 

GATTGCTCCT AAGGAATCCC TACTAGCTTG TGGAGACCAC GGACGAGGAG CTTTAGCTGA 9300 

CCTCACAATT ATTTTAGAAA GAATAAAGGA AACTATCGAT GAAAAAACGC TCTAATATTG 9360 

CACCCATTGC TATCTTTTTT GCTACCATGC TCGTGATACA CTTTCTGAGC TCACTTATCT 9420 

TTAACCTTTT TCCATTTCCA ATCAAACCGA CCATTGTTCA TATTCCTGTC ATTATTGCCA 9480 

GCATTATTTA TGGTCCACGA GTTGGGGTTA CACTTGGATT TTTGATGGGA TTACTTAGCT 9540 

TGACGGTTAA CACGATTACG ATTCTACCGA CAAGCTACCT CTTCTCTCCC TTCGTACCAA 9600 

ACGGAAACAT CTACTCAGCT ATCATTGCCA TCGTCCCACG TATTTTGATT GGTTTAACTC 9660 

CTTACTTAGT CTATAAACTG ATGAAAAACA AGACTGGTCT GATTTTAGCT GGAGCCCTTG 9720 

GTTCcTTGAC AAATACTATC TTTGTCCTTG GAGGAATCTT CTTCCTATTT GGAAATGTTT 9780 

ATAATGGAAA TATCCAACTT CTTCTGGCAA CCGTTATCTC AACAAATTCA ATTGCTGAAT 9840 

TGGTCATTTC TGCAATTCTA ACCCTAGCCA TTGTTCCACG ACTACAAACC TTGAAAAAAT 9900 

AAAAACAGG 9909
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1126 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

TAATTTTCAT ATAATAGTAA AATAGAATGT GTGATTCAAT AATCACCTCA AATAGAAAGG 60

AAATTCTATG TCAAATCTAT CTGTTAATGC AATTCGTTTT CTAGGTATTG ACGCCATTAA 120

TAAAGCCAAC TCAGGTCATC CAGGTGTGGT TATGGGAGCG GCTCCGATGG CTTACAGCCT 180

CTTTACAAAA CAACTTCATA TCAATCCAGC TCAACCAAAC TGGATTAACC GCGACCGCTT 240

TATTCTTTCA GCAGGTCATG GTTCAATGCT CCTTTATGCT CTTCTTCACC TTTCTGGTTT 300

TGAAGATGTC AGCATGGATG AGATTAAGAG TTTCCGTCAA TGGGGTTCAA AAACACCAGG 360

TCACCCAGAA TTTGGTCATA CGGCAGGGAT TGATGCTACG ACAGGTCCTC TAGGGCAAGG 420

GATTTCAACT GCTACTGGTT TTGCCCAAGC AGAACGTTTC TTGGCAGCCA AATATAACCG 480

TGAAGGTTAC AATATCTTTG ACCACTATAC TTACGTTATC TGTGGAGACG GAGACTTGAT 540

GGAAGGTGTC TCAAGCGAGG CAGCTTCATA CGCAGGCTTG CAAAAACTTG ATAAGTTGGT 600

TGTTCTTTAT GATTCAAATG ATATCAACTT GGATGGTGAG ACAAAGGATT CCTTTACAGA 660

AAGTGTTCGT GACCGTTACA ATGCCTACGG TTGGCATACT GCCTTGGTTG AAAATGGAAC 720

AGACTTGGAA GCCATCCATG CTGCTATCGA AACAGCAAAA GCTTCAGGCA AGCCATCTTT 780

GATTGAAGTG AAGAGGGTTA TTGGATACGG TTCTCCAAAC AAACAAGGAA CTAATGCTGT 840

ACACGGCGCC CCTCTTGGAG CAGATGAAAC TGCATCAACT CGTCAAGGGC TCGGTTGGGA 900

CTACGAACCA TTTGAAATTC CAGAACAAGT ATATGCTGAT TTCAAAGAAC ATGTTGCAGA 960

CCGTGGCGCA TCAGCTTATC AAGCTTGGAC TAAATTAGTT GCAGATTATA AAGAAGCTCA 1020

TCCAGAACTG GCTGCAGAAG TAGAAGCCAT CATCGACGGA CGTGATCCAG TCGAAGTGAG 1080

TCCAGCAGAC TTCCCAGCTT TAGAAAATGG TTTTtCTCAA GCAACT 1126



(2) INFORMATION FOR SEQ ID NO: 14: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2520 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

CCGGCAACAA AAAAGAAAAA ATCAACAGTT AAAAAAAATC TAGTCATCGT GGAGTCGCCT 60 

GCTAAGCCAA GACGATTGAA AAATATCTAG GCAGAAACTA CAAGGTTTTA GCCAGTGTCG 120 

GGCATATCCG TGATTTGAAG AAATCCAGTA TGTCCGTCGA TATTGAAAAT AATTATGAAC 180 

CGCAATATAT TAATATCCGA GGAAAAGGCC CTCTTATCAA TGACTTGAAA AAAGAAGCTA 240 

AAAAAGCTAA TAAAGTTTTT CTCGGGAGTG ACCCGGACCG TGAAGGAGAA GCGATTTCTT 300 

GGCATTTGGC CCATATTCTC AACTTGGATG AAAATGATGC CAAGCGTGTG GTCTTCAATG 360 

AAATCACCAA [illegible] TCGTAAGATC GATATGGACT 420


TGGTCGATGC CCAACAAGCT CGTCGGATCT TGGATCGCTT GGTAGGGTAT TCGATTTCGC 480

CTATTTTGTG GAAGAAGGTC AAGAAGGGCT TGTCAGCAGG TCGCGTTCAG TCCATTGCCC 540

TTAAACTCAT CATTGACCGT GAAAATGAAA TCAATGCCTT CCAGCCAGAA GAATACTGGA 600

CAGTTGATGC TGTCTTTAAA AAGGGAACCA AACAATTTCA TGCTTCCTTC TATGGAGTAG 660

ATGGTAAAAA GATGAAACTG ACCAGCAATA ACGAAGTCAA GGAAGTCTTG TCTCGTCTGA 720

CGAGTAAAGA CTTTTCAGTA GATCAGGTGG ATAAGAAAGA GCGCAAGCGC AATGCTCCTT 780

TACCCTATAC CACTTCATCT ATGCAGATGG ATGCTGCCAA TAAAATCAAT TTCCGTACTC 840

GAAAAACCAT GATGGTTGCC CAACAGCTCT ATGAAGGAAT TAATATCGGT TCTGGTGTTC 900

AAGGTTTGAT TACCTATATG CGTACCGATT CGACTCGTAT CAGTCCTGTA GCGCAAAATG 960


[illegible] TTCTAAGCAC GGTAGCAAGG 1020


TCAAAAACGC ATCAGGTGCT CAGGATGCCC ATGAGGCTAT TCGTCCGTCA AGTGTCTTTA 1080

ATACACCAGA AAGCATCGCT AAGTATCTGG ACAAGGATCA GCTTAAGCTA TATACCCTTA 1140

TCTGGAATCG TTTTGTGGCT AGCCAGATGA CAGCGGCCGT TTTTGATACC ATGGCTGTTA 1200

AATTGTCTCA AAAAGGGGTT CAATTTGCTG CCAATGGTAG TCAGGTTAAG TTTGATGGTT 1260

ATCTTGCCAT TTATAATGAT TCTGACAAGA ATAAGATGTT ACCGGACATG GTTGTTGGAG 1320

ATGTGGTCAA ACAGGTCAAT AGCAAACCAG AGCAACATTT CACCCAACCG CCTGCCCGTT 1380

ATTCTGAAGC AACACTGATT AAAACCTTAG AGGAAAATGG GGTTGGACGT CCATCAACCT 1440

ACGCGCCAAC CATTGAAACC ATTCAGAAAC GTTATTATGT TCGCCTGGCA GCCAAACGTT 1500

TTGAACCGAC AGAGTTGGGA GAAATTGTCA ATAAGCTCAT CGTTGAATAT TTCCCAGATA 1560

TCGTAAACGT GACCTTCACA GCTGAAATGG AAGGTAAACT GGATGATGTC GAAGTTGGAA 1620

AAGAGCAGTG GCGACGGGTC ATTGATGCCT TTTACAAACC ATTCTCTAAA GAAGTTGCCA 1680

AGGCTGAAGA AGAAATGGAA AAAATCCAGA TTAAGGATGA ACCAGCTGGA TTTGACTGTG 1740

AAGTGTGTGG CAGTCCAATG GTCATTAAAC TTGGTCGTTT TGGTAAATTC TACGCTTGTA 1800

GCAATTTCCC AGATTGCCGT CATACCCAAG CAATCGTGAA AGAGATTGGT GTTGAGTGTC 1860

CAAGCTGTCA TCAGGGACAA ATTATTGAGC GAAAAACCAA GCGTAATCGC CTATTCTATG 1920

GTTGCAATCG CTATCCAGAA TGTGAATTTA CCTCTTGGGA CAAGCCTGTT GGTCGTGACT 1980

GTCCAAAATG TGGCAACTTC CTCATGGAGA AAAAAGTCCG TGGTGGTGGC AAGCAGGTTG 2040

TTTGTAGCAA AGGCGACTAC GAGGAAGAAA AGATGGCTCT TTGTCAACTG TAGTGGGTTG 2100

AAGTCAGCTA AGCTCGAGAA AGGACAAATT TTGTCCTTTC TTTTTTGATA TTCAGAGCGA 2160

TAAAAATCCG TTTTTTGAAG TTTTCAAAGT TCCGAAAACC AAAGGCATTG CGCTTGATAA 2220

GTTTGATGAG ATTATTGGTC GCTTCCAATT TGGCGTTAGA ATAGTGTAGT TGAAGGGCGT 2280 

TGACGATTTT CTCTTTGTCC TTTAGAAAGG TTTTAAAGAC AGTCTGAAAA AGAGGATGAA 2340 

CCTGCTTTAG ATTGTCCTCA ATGAGTCCGA AAAATTTCTC CGGTTCCTTA TTCTGAAAGT 2400 

GAAACAGCAA GAGTTGATAG AGCTGATAGT GATGTTTCAA GTCTTGTGAA TAGCTCAAAA 2460 

GCTTGTTTAA AATCTCTTTA TTGGTTAAAT GCATACGAAA AGTAGGGCGA TAAAAATGTT 2520 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10993 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

TTTTCTCGAT AATAACTTCC ACCTTATTAT TTGGGATACC CTCCTCTTCT TCACCACCAC 60 

GTTCATAGTA GTCATCGCGA TAGAGAAAAG CTACGATATC AGCGTCCTGC TCAATAGACC 120 

CAGATTCACG AATATCAGAC AAGACCGGTC TCTTGTCCTG ACGTTGTTCT ACACCACGAG 180 

AAAGCTGACT CAGAGCGATT ACTGGAACCT TCAATTCCTT GGCTAGTATT TTCAACTGAC 240 

GAGAAATTTC AGAAACTTCT TGTTGACGAT TTTCTCGACC AGTTCCCGTG ATAAGTTGCA 300 

AATAGTCTAT CAAAATCAAA CCAAGATTTC CAGTTTCTTG AGCCAATTTA CGAGAACGAG 360 

AACGAATCTC TGTAATCCGA ATACCTGGCG TATCATCGAT ATAGATACTG GCGTTAGcTA 420 

GATTACCCTG AGCAATAGTA TATTTTTGCC ACTCCTCATC TGTCAATTGC CCTGTACGGA 480 

TAGAATGTGA CTCCACTAAG CCTTCTGCAG CTAACATACG ATCTACCAAG CTTTCCGCAC 540 

CCATTTCGAG TGAAAAAATA GCAACCGTTT TGTCCAACTT AGTCCCAATG TTCTGAGCGA 600 

TATTCAAGGC AAATGCTGTC TTACCAACTG CTGGACGAGC TGCTAAGATA ATCAACTCCT 660 

CCTCATGAAG TCCTGTTGTC ATATGATCCA AATCACGATA ACCTGTCGCA ATACCTGTAA 720 

TATCGGTCGT TTGTTGCGAG CGAGCTTCCA GATTTCCAAA GTTGAGATTC AACACATCTC 780 

GAATGTTCTT AAACCCGCTT CGATTTGCAT TTTCACTGAC ATCAATCAAC CCTTTTTCTG 840 

CCTGAGCAAT AATTTCATCA GCTGGTTGTG ACGCTTCGTA AGCTTGGTTG ACAGACTCTG 900 

TCAACTTGGC AATTAAACGA CGTAGCATTG CTTTTTCTGC AACAATCTTA GCATAATACT 960 

CCGCATTAGC AGAAGTTGGC ACAGAATTAA CAATCTCAAC CAAGTAAGAC AAGCCACCAA 1020 
TATTCTGTAA ATCACCTTGA TTATCAAGGA TAGTACGAAC CGTTGTTGCA TCTATGGCAT 1080

CACCACGATC GGATAAATCG ACCATGGCTT GGAAAATCAA ACGATGGGCA TACTTAAAAA 1140 

AGTCCCGAGA CTCAATGTAT TCTCGCACAA AAACAAGTTT ACTCTCATCA ATAAAGATAG 1200 

CCCCTAAAAC GGATTGCTCA GCTAAGATAT CTTGAGGTTG TACTCGTAAC TCTTCTACTT 1260 

CTGCCATCAG ACTTCCCTTC CTTTTACAAT CTTGTCAAGA AGGTGTAAAC TTATCCTTCT 1320 

TTCACACGAA GATTGATTAC ACTTGTGATA TCTTGATAGA TTTTCACTGG CACATCAATC 1380 

AAACCAACCG CTCGAATCGG AGCTTGTACT TGAATATGAC GTTTATCAAT CTTAATTCCA 1440 

AATTGCTTTT GCAATTCTTC TGCAATCTTC TTATTGGTAA TAGAACCAAA GGTACGACCA 1500 

TCTGGACCAA CTTTTTCAAC AAATTCTACA ACAGTTTCTT CTGCTTCAAG TTGTGCTTTA 1560 

ATTGCTTTTC CTTCTGCAAT CATCTCAGCG TGAGCTTTTT CTTCCGATTT TTGTTTACCA 1620 

CGAAGTTCAC CTACAGCTTG AGCAGTCGCT TCTTTGGCTA GATTCTTTTT GATAAGAAAG 1680 

TTTTGCGCAT ACCCTGTTGG TACTTCCTTA ATTTCGCCTT TTTTACCTTT TCCTTTAACA 1740 

TCTGCTAAAA AGATTACTTT CATTCTTCTT TCTCCTTTTC CTTCATTTCA TTTAATACAA 1800 

TTTCTGTCAG TTTTTCACCT GCTTCTGACA AGGTTACATC TTTAATTTGA GCTGCTGCCA 1860

AATTAAAGTG GCCTCCACCG CCTAACTCTT CCATAATCCG TTGTACATTC AGTTTACTAC 1920 

GACTTCGAGC TGAGATAGAG ATAAATCCTT GTGTATTCTT CGCAAGAACA AAACTCGCTT 1980 

CAATACCTGA CATGGCTAAC ATGGCATCTG CTGCCTTACT AATAACAACT GTATCATAGC 2040 

ATTTCATGTC CTTAGCCTCT GCTATTAGTA CATCTGAACC TAATTTACGC CCCTGTAAAA 2100 

TAAGTTCATT GACCTCACGA TATTCTTCAA AATCTGTCGC AGCGATTTCC TGGATAGCAA 2160 

TACTATCACT TCCGCGCGTT CTGAGATAGC TAGCAACATC AAATGTCCGA CTAGTTACTC 2220 

GCGAGGTGAA ATTTTTAGTA TCCAACATCA TACCAGCCAT CAAGACACTT GCTTGCATAC 2280 

GACTCAAACG ATTTTTCTTA GAATTCTGGA ACTGAATCAA TTCCGTTACC AACTCAGTGG 2340

CACTACTTGC ACCACTTTCG ATATAAGTAA TAACCGCATT ATCTGGAAAA TCCTGATCCC 2400 

TTCTATGGTG GTCAATAACA ATGGTTTGGG TAAATAAATC ATAAAATTCT TTTGATAATG 2460 

TTAAGGCTGT CTTTGAATGG TCTACAAGAA TCAACAAAGA ACGATTGGTC ACCATCCCCA 2520 

TTGCATCCTT AACAGACAAC AACTTCGTAA CTCCTTCTTT TTCTATGAAT GAAACAGCTC 2580 

GTTCAATATC TGGAGACATT TGTTCTTCAT CATAAAGAGC ATAGCTATTT TCAATCACAT 2640 

TGCTGGCGAA CAACTGCATA CCTACAGCAG AGCCCAAAGC ATCCATGTCT AAATTTTTGT 2700 

GACCGACTAC AAAAACCTGA TCTACACTCC GAATCTTATC TGAAATAGCT GTCATCATAG 2760 

CGCGCGTACG AGTCCGTGTA CGCTTGATTG AAGCAGCAGA CCCACCACCA AAATAAACTG 2820 
GATTTTTCGT TTCGTCGTTT TCCTTAACAA CCACCTGGTC GCCACCACGT ACTTCAGCCA 2880

AGTTCAAATT GAGCAAAGCA ACTTTCCCTA TCTCATCATG ATTTCCATCG CCATAAGAAA 2940 

ATCCCATACT TAAGGTCAAG GGCAACTGTC TCTGTTTCGA CTCTTCTCTG AAAGCATCAA 3000 

TAACAGAAAA TTTATCATTC ATCAAGCCCT CAAGCACCGT GTAGTCAGTA AATAGATAAA 3060 

ATCGATCCAT ACTTACCCGA CGAGAAAACA TCATGTGTTT TTCTGAAAAC TCTGATATAA 3120 

AATTAGCTAC AAAACTATTG ATTTGACTAA TATCTGACTC AGAAGTTTCA TCCTCCAAAT 3180 

CATCATAATT ATCCACAGAG ACAATCCCAA TCACTGGTCT ACTTGTTACC AATTCATCTG 3240 

TTATGGCTTG TTCCCTGGAT ACATCTACAA AATACAAAAC ACCGGAAGAA GCATCCATAT 3300 

GAACAGCATA ACGCTTCTCA CCAAGCTTGG CATAAGTAGA CGGATTTCCT ACTGAAGCCT 3360 

TGATAATCGT TTGAACAGCT TCTAAATCAA AATCACCATC TTCCTTGGTC AAAATCAATT 3420 

CAGCATAGGG ATTAAACCAC TCAACCTCTC CAGAAGATAA ATTCAATTTC ATAACACCTA 3480 

CAGGCATCTG TTCCAATAGA GCTGTCAAAC TTTCTTCCGC TTGGTGGTTT ACATACTGTA 3540 

TCTGTTCTAC ATCACTCCTT GTATAATGCA CTCTCAGTTT CTTAAATAAA AAAACATAGG 3600 

CTCCTACAAA AAGAAACAAA ATTAAAACCG TCAACAGATT ATTATTAACA AAAATAATGA 3660 

AAGTGGATAA GACTCCAAAC GCAATCAATC CTACTAGAAT AGGAAAAATT GGACTTACAT 3720 

AAAATTTTTT CATTCAAAAC CTCTTGGCAC CCATTATACC ATAATACCCC TCAAAAAGCG 3780 

ACTTTTTAAA AGTGTAATCA GTAATTCTAT CAATTATAAG AAAAAGGTAG TTTACAATTC 3840 

AGTAAACCTA CCTTTACACA TATTGAAATT AAGATTCTTT AACCTCTAAC AAACCAATTT 3900 

CGCCATCCTC ACGACGATAA ATCACATTGG TTGTCTGATC TTCAACATCC ACATAGATAA 3960 

AGAAATCATG CCCCAATAAA TCCATTTGTA GAATTGCTTC TTCCAAATCC ATTGGTTTTA 4020 

AATCAATTTG TTTTGAACGA ACAACTTTAG ACTGGACAAT ATTTGAATCT TCCACCAAAG 4080 

CATCTGTAAA TAATTGACCA GTTGCTACCT TATTTTTATT TTTACGCTCG ATTTTTGTTT 4140 

TATTTTTACG AATCTGACGT TCAATTTTAT CAGTTACAAG GTCAATTGAA CCATACATAT 4200 

CTTGAGATAC ATCTTCTGCG CGGAGAGTAA TAGATCCAAG CGGAATCGTT ACTTCCACTT 4260 

TAGCCGTTTT TTCACGATAA ACTTTTAAGT TAATTCGGGC ATCCAACTCT TGTTCTGGTT 4320 

GGAAGTACTT TTCGATCTTT TCGAGTTTAG AAACTACATA ATCACGAATT GCTTCTGTTA 4380 

CTTCTAGGTT TTCACCACGG ATACTATATT TAATCATATG AGTACCTTCT TTCTAAACAT 4440 

TTTTGTTTTT ATGATTTTAT TATAACGCTT TCATTCTATT TTTGCAAATT TTTTCCTCAT 4500 

CTTACAAGGG AAAATGTTTT TACATCCTTA GCACCAGCTT CTTCCAACAG TTTCTTAACA 4560 
CGATTTATAG TTGCTCCTGT AGTATAGATA TCATCTATAA GTAGGATTTT TTTAGGAATA 4620

GTGACTCCAC TTTTAATAAA GAAAGGAAGT TCTGTCCCCA AGCGCTCTGA ACGATTTTTA 4680 

GAAGAACTGG CTCTCTCTTC TCTTTTCTCT AATAAATCCA GATACTCAAA GCCTGCTGCC 4740 

TCTACCAAGC CCTCAACCTG ATTAAATCCT CTATTAGCAT ATCTATCAGG ACTTAGGGGA 4800

ATTACAACAA ATTGATACTC TTTGTACTTT TTCAACTCCT CACTTAAAAA TGAAGCGAAA 4860 

ACTTTTCTTA ACAGGAAGTC TCCATCAAAC TTATACCGAC TGAAAAAATC CTTCATAGCT 4920 

TGATTGTAAG TAAAAATCGC TCTATGACTG ACTTCAACTC CCTCTTTACA CCAAAGTTGA 4980 

CAATCTTGAC ACTTTGTTGA CAACTCTGTT TTCATACAAT TTGGACAGTT CTCTTCCCGA 5040 

ATTCTTTCAA AAGTAGAATC ACAGTCTGAA CAAAGACAAG AGTCATCATT CCTCAGAAGT 5100 

AAGAGACTAC TAAAAGTTAA AACAGTCTTC ATAGTCTGCC CACATAACAA GCACTTCATA 5160 

GACCAGCCTC CTTATTCATC ATCTGAATTT CCTTAATCGC CTTCTTGATT GAAGGATTTA 5220

ACCCATCATG GAAGAAAAGC AAATCTCCTG TCGGTCTATC CATGCTTCGT CCAACTCGTC 5280 

CACCAATCTG AATCAAACTA GACTTGGTAA ACAAACGATG ATTGGCCTCT ACTACGAAAA 5340 

CATCCACACA AGGGAAGGTA ACTCCGCGCT CCAAGATTGT CGTACTGATA AGTATTGTCA 5400 

GTTCTCCATC TCGAAAAGCT TGTACTTGCT CTAATCGATC CTCTGTTACA GAAGATACAA 5460 

AGCCAATTTT CTCATTTGGA AATTGCTCCT GTAAGATTTC TGCTAACTGC TCCCCTTTCT 5520 

TAATTTCTGA AGCAAAAATG AGTAACGGAT AAGCTGTCTT TCTCTGCTTC TCAATATAGG 5580 

ACTTTAACTT TGGTGACAAA CGATTCTTGT CTAAGTAGCG ATTAAAATCC GATAACCAAA 5640 

TTGGTTTTGG AATAATCAAC GGATTTCCAT GAAACCGTCT CGGTAAATTC AGTCTTTTTA 5700 

GTTCTCCTAA ACGGACCTTT TTATCTAACT CATTGGTCGA AGTCGCTGTT AAAAAGATTC 5760 

TCAATCCATT CTCCTTTACA CTATTCTTGA CAGCGTGGTA AAGCATGGGA TTATCAACAT 5820 

AAGGAAAAGC ATCTACTTCA TCCACTATCA GCAAATCAAA AGCTTGATAA AACTTCAATA 5880 

ACTGATGGGT TGTTGCAACA ACTAGTGGTG TTCGAAAATA AGGTTCCGAT TCTCCATGTA 5940 

GCAAAGCTAT CCCGCAAGAA AAATCCTGTT GCAGGCGCTT GTACAGCTCC AAACAAACAT 6000 

CTATGCGAGG ACTAGCCAAA CACACTGCAC CACCCGCATT GATCACTTTA GCCACTACTT 6060 

GATAAATCAT TTCTGTCTTT CCAGCTCCTG TTACCGCATG AACTAAGGTT GGCTTTTGCT 6120 

TGTCTACTAC TTGAAGCAAT CCCTCTGACA CCTTCTCTTG AAAAGGAGTT AATTGGCCGC 6180 

GCCATTTGAG AACATCTTGC TTTGGAAAAT CCTCCTGCGG AAAATAGTAT AAAGTTTGAT 6240 

CACTTCTGAC TCGCTTCATC AGCAAGCACT CTCGACAATA GTAAGCACCG ATGGGCAAAT 6300 

ACCATTCTTC TAGAATAGTA CTATTACAGC GTTGACAGAA AAGTTTCCCC TTCTCCTTTC 6360 

TCATTGCTGG AAGTTTCTCC GCCAACTGAC GTTCTTCTTC TGTTAATTCA TTCTCAGTAA 6420

ATAAACGACC GAGATAATCT AAATTTACTT TCATACTTCT TTATTCGTAA AAACTAGCAC 6480

TTTAGATGAT TTTTTAGTAC AATTAAATCA TGGAATTTAG GACAATTAAA GAGGACGGTC 6540

AAGTCCAAGA AGAAATCAAA AAATCTCGCT TTATCTGCCA TGCCAAGCGT GTTTATAGCG 6600

AAGAAGAGGC TCGTGACTTC ATTACTGCCA TCAAAAAAGA ACACTACAAA GCGACACATA 6660

ACTGCTCTGC CTTCATTATT GGAGAACGTA GTGAAATTAA ACGTACAAGT GATGATGGTG 6720

AGCCTAGTGG TACTGCTGGT GTTCCCATGC TTGGGGTACT AGAAAATCAC AATCTCACCA 6780

ATGTCTGTGT GGTCGTGACA CGCTACTTTG GTGGTATTAA ACTAGGCGCT GGAGGACTAA 6840

TTCGTGCTTA CGCCGGCAGT GTCGCCTTAG CTGTCAAAGA AATTGGTATT ATTGAAATAA 6900

AAGAACAGGC TGGCATTGCT ATTCAAATGT CTTATGCTCA GTACCAAGAG TACAGTAACT 6960

TCCTTAAAGA ACATGGTCTC ATGGAGCTGG ATACAAACTT TACAGATCAA GTCGATACGA 7020

TGATTTATGT TGATAAAGAA GAAAAAGAAA CTATTAAAGC TGCACTTGTG GAGTTTTTTA 7080

ATGGAAAAGT CACTTTAACT GACCAAGGTT TACGAGAGGT TGAAGTTCCT GTAAACTTAG 7140

TGTAAACAAT GAATAATACA GCGTTTCGTT GACATTCTCA CAACTACTTT AGCGAGCAAA 7200

ATAAAAAGAG GCGTACCAAA ATATACTAGA AAATGAAGCA ATTCAAACGA AACCTGATAT 7260

CGTTTTCCTT CACACCTATT TACTAGAATT AGCTGAACGC AATCACTTGA AAATTAATGA 7320

CTTTGATCTA TGATATATAG AAATGGTATG GATAGCGTTA TACTAAAGAT ATCTTATACA 7380

AAGAGGTATT CATATGTCTA TTTATAACAA CATTACTGAA TTAATCGGTG AAACACCGAT 7440

TGTTAAACTT AACAACATCG TGCCAGAAGG TGCTGCAGAC GTCTATATAA AGCTTGAAGC 7500

ATTTAATCCT GGTTCATCTG TAAAAGACCG TATTGCCCTT AGCATGATTG AAAAAGCTGA 7560

ACAAGATGGT ATTCTGAAAC CTGGTTCTAC TATTGTTGAA GCAACAAGTG GAAACACCGG 7620

TATTGGACTT TCATGGGTAG GTGCTGCTAA AGGGTATAAA GTCGTCATCG TTATGCCTGA 7680

AACTATGAGT GTAGAACGAC GTAAAATTAT CCAAGCTTAT GGTGCTGAAC TCGTCCTAAC 7740

TCCTGGTAGC GAGGGAATGA AAGGTGCTAT TGCTAAGGCT CAAGAAATCG CTGCTGAACG 7800

TGATGGTTTC CTTCCTCTTC AATTTGACAA TCCAGCTAAT CCAGAAGTAC ACGAAAGAAC 7860

AACAGGAGCT GAGATACTAG CTGCTTTCGG TAAAGATGGA TTAGATGCCT 7920

AGTAGGTACT GGTGGAACGA TTTCTGGTGT TTCTCATGCA CTCAAATCAG AAAATTCTAA 7980

CATTCAAGTT TTTGCAGTAG AAGCAGATGA ATCTGCTATT CTATCTGGTG AAAAAGCTGG 8040

TCCTCACAAA ATTCAAGGTA TCTCAGCTGG ATTTATTCCT GATACACTTG ATACTAAAGC 8100

CTATGATGGT ATCGTTCGTG TAACATCAGA TGACGCTCTT GCACTCGGAC GTGAAATTGG 8160

TGGAAAAGAA GGCTTCCTTG TAGGGATTTC CTCAGCTGCA GCTATCTACG GAGCCATCGA 8220

GGTTGCCAAA AAATTAGGTA CAGGTAAAAA AGTCCTTGCC CTAGCACCAG ATAACGGTGA 8280

ACGTTATCTC TCTACAGCAC TTTATGAATT GTAACCGTCC AATAACGAAG TCTATTGAAA 8340

AATCTCCAGA CTAGAGAACT CACGGATAGT TCCTAATCTG GAGATTTCTT ATTTGCACTT 8400

TTCTTGTACA ACTTTAGTCC ATGGTAAATA GGCCTCTAAA ACCTCTTTGT TTACGAGAGT 8460

TTCCACGTTT GGAAGACATT CTAGAAGATA GGATAGATAT TTCTCACTAT TTATAATGGA 8520

TTGAAATAAG ATATGAACAA ATCGATTAGA ACATGATGGT AAAGCGTAAT CCCTTGTTTC 8580

TCAGCTTTCC CAGACAAAAA AGTCCAATAG TAAGTCAGCT GACTATCACT CTCTAGCACC 8640

CTATAAGAAG TTTCATCCGC ATGAAGTAAG GGCTGAGTCA ATAGTCTCTC TCGCAAGAGG 8700

TTATAAAGGG GCTCCAAATA GTATTGACTC GTCTTGATAT GCCAATTAGA GATTTCCTTA 8760

CGTGTGATTG GTAAACCCAT CCTAGCCCAA TCTTCTTCTT GGCGATAATT GGGTACCTTC 8820

AGATTAAACT TCTGATGGAT GGTGTGAGCG ATAATAGAAG CTGAGCCAAA GTTATGCGCT 8880

AAAGGGGCTT TAGGAATAGG AGCTTTCACA AGCTTATCCA GATGATTATC TTTTACTCGT 8940

TATGGACAAT GCTATATGGC ATAAATCAAG TACCTTAAAG ATTCCGACTA ATATTGGCTT 9000

TGCATTTATT CCTCCATACA CACCAGAGAT GAACCCCATT GAACAAGTGT GGAAAGAGAT 9060

TCGTAAACGT GGATTTAAGA ATAAAGCCTT TCGAACTTTG GAAGATGTGA TACAAGGACT 9120

GGAGAAGGAG GTGATAAAGT CCATCGTTAA TCGGAGACGG ACTAGAATGC TTTTTGAAAA 9180

CAGATGAGTA TAAAAAGAAA GTCCTCATTT CAATAGAAAT CACGACTTTC TGATGAATTT 9240

ATAGTAAAAT GAAATAAGAA CAGGATAGTC AAATCGATTT CTAACAATGT TTTAGAAGCA 9300

GAGGTGTACT ATTCTAGTTT AAATCCACTA TATTTGGGGA GTGATAGAAA AGCCCTTCAT 9360

CAGCCAATCT ACTTGTTCAG GTGCGAGAGC TTTGACATCC TTTTCTGTAC TGGACCAAGT 9420

CAGTTTTCCG TTCTCAAAGC GTTTATATAA TATCCAAAAT CCTTGACCAT CCCAGTAAAG 9480

AACTTTAAAG CGGTCTTTAC GTCCACCACA AAAGAGAAAG ACTTGATCGG AGAAAGGATC 9540

CAATTCAAAG TGGGTTTTAA CTACATAGGC TAATGAGTCT ATTCCCTGCC TCATATCTGT 9600

CTTGCCACAA ACAAGGTGAA CTTGACCTAA ATCACTTAGT TGAATTATCA TAGTACAATA 9660

CCTTTCCTCC GATAATTATT TTTTATCTGG TATACTGGAA GTTGGGGAAT TAGGATAGAT 9720

ACCTTGTTAT GACGCGCTTA CTATGAATTT GAAGTATAGT CTCCTAAATG CACTTAGCCC 9780

TTATTATAGG GGTTTTTGTT TTAATTATTC TAATCGAGTG AGACTGGGGA AAAAACAATT 9840

TCAGGAAAAA TCTAAGCCCT ATACAAAAAA GGAAGCAATT TGCTTCCTTT CTATTATTAG 9900




TTATTCAAGG CTGCTGCCAT TGTAGCTGCA ACTTCAGCTT CGAAGTCGTT TGCAGCTTTC 9960 

TCGATACCTT CACCAACTTC AAAGCGAGCA AACTCAACTA CCGAAGCGTT AACTGATTCA 10020 

AGGTATGCTT CAACTGTCTT GCTGTCATCC ATGATGTAAA CTTGTGCAAG AAGTGTGTAA 10080 

GCTTGGTCAA CTTTAGTGTT ATCAAGCATG AAGCGATCCA TTTTACCTGG AATAATTTTG 10140 

TCCCAGATTT TTTCTGGTTT GCCTTCTGCA GCCAATTCAG CTTTGATGTC AGCTTCAGCT 10200 

TGAGCAATAA CATCATCAGT TAATTGAGCT TTTGATCCAT ACTTCAAGTG TGGAAGAGGT 10260 

GGTTTATTAA CCATTGCAGG GCTTTCGTTG TCTTGGTCGA TAACGTGATT CAATTGTGCG 10320 

AACTCATCTT TAACGAATTG CTCATCCAAT TCTTTGTAAG AAAGAAGTGT TGGTTTCATC 10380 

GCTGCGATGT GCATTGACAA TTGTTTAGCA AGTGCTTCGT CTCCACCTTC AACAACTGAA 10440 

ATAACACCGA TACGTCCACC GTTATGTTGG TATGCTCCAA AGTGTTGTGC GTCTGTTTTT 10500 

TCAATCAATG CAAAGCGACG GAATGAGATT TTCTCTCCGA TAGTTGCTGT TGCAGATACG 10560 

TATGCAGCTT CAAGAGTTTC ACCTGAAGGC ATTATCAAAG CAAGAGCTTC TTCGTTGTTA 10620 

GCAGGTTTTC CTTCAGCAAT GACTTTAGCT GTAGTATTTA CCAATTCAAC GAATTGAGCG 10680 

TTTTTTGCAA CGAAGTCAGT TTCAGCGTTT ACTTCAATAA CTGCTGCAAC ATTACCGTTA 10740 

ACATAAACAC CAGTCAAACC TTCTGCAGCA ACACGGTCAG CTTTCTTAGC TGCCTTAGCC 10800 

ATACCTTTTT CACGAAGCAA TTCAATCGCT TTTTCGATGT CACCGTCTGT TTCTACAAGC 10860 

GCTTTTTTAG CGTCCATAAC ACCGGCACCA GATTTTTCAC GCAACTCTTT TACAAGTTTA 10920 

GCTGTAATTT CTGCCATTTT AATTCTCCTA TATTTTTTGA AAATAGGAGA GCGCGGCTAA 10980 

GCCCCGCCTC CGG 10993 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8411 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

CGACGGGGAG GTTTGGCACC TCGATGTCGG CTCGTCGCAT CCTGGGGCTG TAGTCGGTCC 60 

CAAGGGTTGG GCTGTTCGCC CATTAAAGCG GCACGCGAGC TGGGTTCAGA ACGTCGTGAG 120 

ACAGTTCGGT CCCTATCCGT CGCGGGCGTA GGAAATTTGA GAGGATCTGC TCCTAGTACG 180 

AGAGGACCAG AGTGGACTTA CCGCTGGTGT ACCAGTTGTC TTGCCAAAGG CATCGCTGGG 240 
TAGCTATGTA GGGAAGGGAT AAACGCTGAA AGCATCTAAG TGTGAAACCC ACCTCAAGAT 300

GAGATTTCCC ATGATTATAT ATCAGTAAGA GCCCTGAGAG ATGATCAGGT AGATAGGTTA 360 

GAAGTGGAAG TGTGGCGACA CATGTAGCGG ACTAATACTA ATAGCTCGAG GACTTATCCA 420 

AAGTAACTGA GAATATGAAA GCGAACGGTT TTCTTAAATT GAATAGATAT TCAATTTTGA 480 

GTAGGTATTA CTCAGAGTTA AGTGACGATA GCCTAGGAGA TACACCTGTA CCCATGCCGA 540 

ACACAGAAGT TAAGCCCTAG AACGCCGGAA GTAGTTGGGG GTTGCCCCCT GTGAGATAGG 600 

GAAGTCGCTT AGCTTTAATC CGCCATAGCT CAGTTGGTAG TAGCGCATGA CTGTTAATCA 660 

TGATGTCGTA GGTTCGAGTC CTACTGGCGG AGTAATtGAT AAAAGGGaAC ACAGCTGTGT 720 

TCCTCTTTTT GTATCAATTT GTATCACCAA GCATTTTCAT AAGGAAGTCT GTTATTTCTT 780 

GAGAACTTTC TTTTTTTCCA TGTGCAATCC AAGTTTGGCA GACACCAAAA AGTGCATGAG 840 

TTAGATAGAT GCTACTATAT TCTAATTCAG TGGTATTTAG ATTCAGTTGC ATAAATCGCT 900 

TTTGTAAATC TGTACTAAGC ATGATATGAA GTTTATTTCG TAAGAAATTT TGGATTTCTT 960 

TAGTCCCATT TTCAGAAAGA AGGGCAGCCA GAAGTGGTTC TGACTCTAGA TATTCAAAAA 1020 

CTTCTAAAAT AGCGTCTCTT TTGTGATGAG CATGTTTTTG AAAAATATAT TCAAATGTAT 1080 

GGAATAGCTT GCTTTGATAG TGCTCAATCA TATCATACTT ATCCTTATAG TGAGTATAGA 1140 

AGCTGGAACG ACTAATTCCG GCTTTTTCTA CTAATTTGAC AGTAGAAATT TTATCAAATG 1200 

GCTGTTCCAT CAGTAATTGT ACCATAGCAT TTTCAATAGT TCGCTTTGTT TTTAAGCGTT 1260 

TGTTACTTTC TTGCATATTT CCTCCTTGTA AACAAATTAG ACTATATGTC TAAAAATAGA 1320 

TTTTTTATCT TGTAATTTAG ATTTTTTAAT GTATAATCTA TTATATCAAA ATTTTAGACA 1380 

ATATGTTTAA AAAAGGAGAA ACTAAGTTTA AAGAATGGAA AGCAATTTAA AAAAAACCAA 1440 

CCTTTATTAT TGTCATGATC GGGATTTCTC TTATTCCAGA TCTGTACAAT ATCATATTTT 1500 

TGTCATCAAT GTGGGATCCA TATGGGCAAT TGTCTGACTT ACCTGTGGCA GTTGTAAATA 1560 

ATGATAAAGA GGCTTCCTAT AATGGTAATA CTATGGCAAT AGGAAAAGAC ATGGTGTCCA 1620 

ATTTAAAAGA AAATAAAACC TTGGATTTTC ATTTTGTAGA TGAAGAGGAA GGAAAGAAGG 1680 

GATTGGAAGA TGGCGATTAC TATATGGTAG TGACTTTACC AAGTGATTTA TCTGAAAAAA 1740 

CAACTACATT ATCCAATATT CAATCGACAG CAGCTTATCA ATCATTGACA AGTGAGCAAC 1800 

AAACTGAGAT AAGTGATTCT GTATCTCAAA ATTCAACTGA TAGTATTCAA TCGGCTCAGT 1860

CAATTGTAGC TTTAGTACAA GATTTACAGG GAAGTTTAGA AAACTTACAA AATCAATCTT 1920 

CTAATCTTTC GACTTTAAAA AATCAATCTA ATCAAGTATC ACCTATTACT TCTACTTCTT 1980 

TGATAGGATT GTCAAGTGGA TTAACAGAGA TACAAGGAGA TGTTACTAGC AAATTAGTTC 2040 
CTGCCAGTCA GTCGATTGCA TCAGGTGTAA ACGCATATAC TACAGGTGTT GATAAAGTTT 2100

CTCAGGGCGC AAGTCAACTA AGTGAAAAAA ATGCCACCTT GACAGGTAGT TTGGATAAAC 2160

TAGTTTCAGG CTCAAACACC TTGACACAAA AATCTTCTAG ATTGACAGCA GGAGTTGGTT 2220

AATTACAATC AGGATCTGGG CAATTAGCAG ACAAATCCAG TCAGTTACTT TCAGGTGCTT 2280

CTCCATTAGA GAATAGAGCT AATAAATTGG CAGATGGATC TGGGAAACTA GCAGAAGGTG 2340

GAACAAAGTT AACTTCTGGA TTGGAAGATT TACAGACAGG ACTTGCTTCT TTAGGACAAG 2400

GACTAGGTAA TGCTAGTGAT CAACTCAAAT CAGTATCAAC AGAATCTAAA AATGCAGAGA 2460

TTTTGTCAAA TCCACTCAAT CTTTCAAAAA CAGACAATGA TCAAGTTCCT GTAAATGGAA 2520

TCGCAATAGC TCCTTATATG ATATCAGTTG CTCTTTTTTT GCAGCAATAT CAACAAATAT 2580

GATATTTGCG AAATTGCCTT CAGGACGTCA TCCAGAGAGC CGTTGGGCTT GGTTGAAATC 2640

TTGAGCTGAA ATAAATGGTA TTATAGCTGT TTTGGCAGGA ATTTTGGTAT ATGGAGGAGT 2700

TCAGCTTATT GGTTTAACTG CTAATCATGA GATGAGAATA TTTATTCTCA TCATCCTAAC 2760

AAGTTTAGTA TTCATGTCTA TGGTGACCAC TTTAGCAACG TGGAATAGCC GTATAGGAGC 2820

TTTTTTCTCA CTTATTTTGC TTTTACTACA GTTAGCATCA AGTGCAGGTA CTTATCCACT 2880

TGCTTTGACA AATGATTTCT TTAGATCTAT TAATCCCTGG TTACCAATGA GCTATTCAGT 2940

TTCGGGATTA CGACAAACAA TCTCTATCAA CAAGTCATTT TCCTAGCTGT CATACTAGTT 3000

CTATTTACTA GTTTAGGTAT GCTAGCCTAT CAACATAAGA AAATGGAAGA AGATTAAAAA 3060

AATCGACCGA TTAACTGGTC GATTTTTTAT GCCTTAGATG ACTTTCGTCT GTGATTATAG 3120

ATTCCAAATA GTAAGAGAGA AGTAAAGGAA CAGATTGCTC CAGTAATAAA ACCATTGGGA 3180

ATGAAGGAAA GTGTAATAGT TCCTTTCCCC TTGGGAATGT CAACTTTCAT AAATCCAGTT 3240

TGAGCTTGTT TAATTTCTAT TTTCTTACCA TCTTGGTAGG CAGACCAACC TTTGTCATAA 3300

GGAATGGTGA AGAAAATAGA TGTATCTTGT TGGACATCAT ATGTAGCAAA AACCTTGTTT 3360

TTAGAAGTTG ATACTGTGAC AGGTTGTTCT TTAATTTTTT GAATTGCCTC GGTGAAAGTT 3420

TTGGTATCTA AACGATAGAA GGTAGGAGAT TCAAATGATA CTTGTGAATT TCCAGGGAAA 3480

CTAACATTGA TATTGAAAGT TTTTTTCTCT TTAGTATATC CTAGATTAAA GAAGGAGAAG 3540

ACATTATCAG TTGTAAAAGT CriTTTTTCA CCATTTACAA GGATGTCAAC CTTCTTTTGT 3600

TTATCGTTAG AAAAGTGAAG GTTTATGAAA GAGAGATAAA CTTGGCTGTT TTCTGGAACT 3660

TCAATTTGAT ACTGGATTGC TGCATCTTCA TTTGAAGAAC TTGTGACACT AATCAAATGA 3720

TTAGTATTTT CTATTTTTTC TGTTTTTTCA TAAGGTATTG GAGAAAAATA ATCAAAATTG 3780

ACGTTAGCAA GTTGATTTAA AAATGAGGCC TGATTATCCA AGGTATGTTC ATTGAACTTG 3840

ACATCATTGT AAACAGATTG ACTCGCAACT GCAATCGGAA GAGAGTATTG ATTTTCATAT 3900 

AGGGTAAGAT TATCTTTTTG ATAGATATCT TTAAAGCCAT ACTTATCAAT AGGACTGTCT 3960 

GAGATATTGT ACTGGATACC AAATAAACTA TCAGCCAAAA TACTATTATT TGCATATCGG 4020 

AGATTGAGAT TAGTCCCAGA GGATTTAAAA CCAAGTTTAT CTAAAGTAGA GCTTGATGAA 4080 

CGATTTCGAA CAGATGAAAA TTGAGAGATT CCATTGTAGT TGAATTTCAT ACTGTCATTT 4140 

CCTGTCTGAG TTTGTAGTTT TTCAGTACGA GTAAATTGAT TTCCAATATA TGTTGAGAAA 4200 

GATTCCATAG CTGGGATATC TCGACTATAA GCACTTCGAG AAGCAAATGC CCATTCCTTA 4260 

GCAATTCCGT CCATTTGAGA TGAAGCATTT AAACTCATTT CAACCAGTAT AAATAAAGAG 4320 

ATTAGAATGG CAAATAGATT CACAGATATA AACTTTTTGA TAACTGCAAG GAGTAAAAGA 4380 

GAATAGACAA CCAAAAATTC AAGAGTAAGC AGAATATTCA AATCTGTTAA AAAAGAATAA 4440 

TGCGATTTTA GATAGATGGT AGCTAAAAAT CCTGCTACTA CAAGAAAAAG CGAAACTAAA 4500 

AAATTCCAGA CTTTAAGTTC TTTCAGACGC TTTAAGACTT CTGCTGCTGT GTAAATTAAC 4560 

AAGGTAGAGA AAATCCAAGC ATAGCGATGT AAAAACATGT TTGGAGTATG CATGCCTTGC 4620 

CAAAATAAGT CAAGAGCTTC TATGTAAAAG CTTGCAATTA GAAATGCAAA GAATATTACA 4680 

TATATGAGTT TCACGTGAAA CTTAATAGAT TTCAGCGTAA AAAATAAAAT GGTCAAAATA 4740 

AAGGGAAATA GTCCAACAAA AATCATTGGG ATGGCCCCAT ACTTTGTTGT GTCAAAGGAA 4800 

CCAATGAATT GCTTAGCAAA GAGATCAAGA TACCAGCTAC TTTCAGTTTG AAACTTTGTA 4860 

ACTTCAGTCA ATTTTTCCCC ATGTGTCTGT AAATCAAATA GAGTGGGAAG AGTCATAATC 4920 

AAACTAGCCA TACCAGCTAA AAAGGAGATA ACTATGAAAT CAAGAACAGA TGATTTTCGA 4980 

GTCTTAAAGT CCCACGAAAT TTGACAGAGA TACCAGAAAA TAAGAAACAA TACTGTCATA 5040 

TATCCAAAAT AATAATTTTG AATAAATAAG ATTGACAGAC TTGTAAAGTA CAATAGGAGT 5100 

TTCTTTTCAG TTATCAGTAG ATGTAAACCA GTTATAATTA AAGGAATCAA GATAAAAACA 5160 

TCTAGCCAGG TTTTTATCTC TAATTGACTG ACAGTGAAAC TCATCAGAGC ATAGGAAGTA 5220 

GATAAGGCTA GTTTTAAAAT CTGAGGGATA GATTGAAACA ATTTATTCAA ACTAAAAAAG 5280 

GTTGACAGAC CAATCAATCC AAATTTTAAG AGAGTTGTCA GATAGATAGC ATCTGGCATA 5340 

TTCGTTAGAT CAAAAAAGTA AACCAGAGGC GCGAGAAAAC TACCCAAGTA ATAACTAGAT 5400 

AGGGCATAGA AGTTTAGCCC TAGACCACTT GTAAAGGTGT AAAACAGATT ACTATTTCCA 5460 

TGTAGGATAT TTCGTAAGGC TACATCAAAA ATAACGTATT GATGAAAGCC ATCTCCTAAT 5520 

AGAGGAGAGT TGTCGCTATT CCAGTAGATA CTTTGAGATA GATATACTCC AGACATAATC 5580 
ACTACAGGAA TGATGAAAGA AATAAAATAG GTTCGATATG TTTTTAAAAA TGATTTCATG 5640

TTACCTCGTA GAATGATAGA AAACTCAGTT GGTTAACCCA ACTGAGTTTT GAAGTTTTAT 5700 

TTAGTCTTTC CAAAGTTCTT TAACTTTTGC TTGTACTTCT GCATTTTCTA GGAATTCATC 5760 

GTAGGTTTCA TCGATACGGT CAATGACGCC ATTTTTAGAT AAGACAATGA TATGGTTAGC 5820 

CAAAGTTTGA ATAAATTCGT GGTCATGGCT GGCAAAGATG ATTGATTCTT TAAAGTTTTT 5880 

CAATCCATCA TTCAAGCTTG AGATAGATTC CAAGTCCAAG TGATTTGTTG GATCATCAAG 5940 

TACAAGGACA TTTGATTTTA AGAGCATGAG TTTTGAAAGC ATGACACGAA CTTTTTCTCC 6000 

CCCTGACAAG ACATTTACAG GTTTGTTAAC TTCATCTCCA GAGAAGAGCA TACGGCCGAG 6060 

GAAGCCACGT AGGAAAGTAT TGTCATGTTC TTCTTTACTT GCGAATTGAC GCAACCAGTC 6120 

AAGAATTGAT TCTCCTCCTG CAAAATCAGC TGAGTTATCT TTTGGTAGGT AAGATTGACT 6180 

AGTTGTAACT CCCCACTTGA CAGTTCCTTC ATAGTCAATA TCTCCCATGA TTGCACGAAT 6240 

TAATGCAGTC GTTTGAATAT CATTTTGTCC AATAAGTGCT GTCTTATCAT CTGGAGGCAA 6300 

GATGAAACTA ATATTATCCA AGATAGTTTC ACCATCAATG TTTACAGTTA AATTTTCTAG 6360 

TGTCAAGAGA TCATTACCAA TCTCACGTTC CGCTTTAAAG TTGATAAATG GATATTTACG 6420 

ACTAGATGGC ACAATCTCTT CTAGCTCAAT CTTATCAAGC ATTCTGTTAC GTGATGTTGG 6480 

CTGCCTTGAC TTAGAAGCAT TGGCAGAGAA ACGAGCAAGA AATTCTTGCA ATTGTTTAAT 6540 

TTTTTCTTCT GCTTTAGCAT TACGGTCTGC TAGCAATTTA GCAGCAAGCT CAGAAGATTC 6600 

CTTCCAGAAG TCGTAGTTTC CGACATAGAG TTTGATTTTT CCAAAGTCAA GGTCGGGGAT 6660 

GTGAGTACAA ACTTTGTTTA AGAAGTGACG GTCGTGGGAT ACTACGATAA CTGTGTTATC 6720

AAAGTCAATC AAGAAGTCTT CTAACGAAGT AATCGATTGG ATATCCAAAG CGTTAGTAGG 6780 

CTCGTCCAAG AGAAGAACAT CTGGTTTACC AAAAAGTGCT TTGGCGAGGA GAACCTTTAC 6840 

TTTTTCAGCG TTGGCGAATT CGCTCATGTT TTGGTAGTGT AATTCTTCTG GAATGTTTAG 6900 

GTTTTGAAGT AGTTGAGAGG CTTCACTCTC TGGTTCCCAA CCTGCAAGTT CGGCAAACTG 6960 

TCCTTCGAGT TCGGCAGCAC GAACCCCGTC CTCGTCTGAG AAATCTTCCT TCATGTAGAT 7020 

AGCATCTTTC TCTTTCATGA TGCTATAAAG TTTTTCATTT CCCATGATAA CGAGATGAAT 7080 

GGCACGTTCA TCTTCGTAGT CAAAGTGATT TTGACGAAGA ACAGAGAGAC GTTCATCTGG 7140 

ACCAAGAGAG ATGTGACCAG TAGTAGGTTC GATATCTCCA GCTAAAATTT TTAAAAAGGT 7200 

TGATTTTCCG GCACCATTAG CACCGATTAA TCCGTAAGTA TTTCCTTCTG TAAATTTGAT 7260 

ATTGACATCA TCAAAAAGTT TGCGATCACT AAAACGTAGT GAAACATCAG ATAGTGTAAG 7320 



CAATGTTTTT CTCCTATATG TGTAATATAT TTATTCTACT AGAAAATACA GAAATATTCA 7380

AATTTTTATT TGTCAATTTT GTGTAAATTA TATTTACAGT ATCCTTTACA CAAATCTGTA 7440 

AAAAGCAAGG CTGATTTATT TTGATAAATT ACGGTTATTT CATTAAAAAA ATGCTATAAT 7500 

TGAAAGGACT ATATCGAAGG AGAACAAAAT GACTAAACCC ATTATTTTAA CAGGAGACCG 7560 

TCCAACAGGA AAATTGCATA TTGGACATTA TGTTGGAAGT CTCAAAAATC GAGTATTATT 7620 

ACAGGAAGAG GATAAGTATG ATATGTTTGT GTTCTTGGCT GACCAACAAG CCTTGACAGA 7680 

TCATGCCAAA GATCCTCAAA CCATTGTAGA GTCTATCGGA AATGTGGCTT TGGATTATCT 7740 

TGCAGTTGGA TTGGATCCAA ATAAGTCAAC TATTTTTATT CAAAGCCAGA TTCGAGAGTT 7800 

GGCTGAGTTG TCTATGTATT ATATGAATCT AGTTTCGTTA GCACGTTTGG AGCGAAATCC 7860 

AACAGTCAAG ACAGAGATTT CTCAGAAAGG ATTTGGAGAA AGCATTCCGA CAGGATTCTT 7920 

GGTCTATCCA ATCGCTCAAG CAGCTGATAT CACAGCTTTC AAGGCTAATT ATGTTCCTGT 7980

TGGGACAGAT CAGAAACCAA TGATTGAGCA AACTCGTGAA ATTGTTCGTT CTTTTAACAA 8040

TGCATATAAC TGTGATGTCT TGGTAGAGCC GGAAGGTATT TATCCAGAAA ATGAGAGAGG 8100

AGGGCGTTTG CCTGGTTTAG ATGGAAATGC TAAAATGTCT AAATCACTAA ATAATGGTAT 8160

TTATTTAGCT GATGATGCGG ATACTTTGCG TAAAAAAGTA ATGAGTATGT ATACAGATCC 8220

AGATCATATC CGCGTTGAGG ATCCAGGTAA GATTGAGGGA AATATGGTTT TCCATTATCT 8280

AGATGTTTTT GGTCGTCCAG AAGATGCTCA AGAAATTGCT GATATGAAAG AACGTTATCA 8340

ACGAGGTGGT CTTGGTGATG TGAAGACCAA GCGTTATCTA CTTGAAATAT TAGAACGTGA 8400

ACTGGGTCCG G 8411

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9064 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
TGCCGTACTC AAGTACAGCC TGCGCTAAGT TTCCTAGTTT GCTCTTTGAT TTTCATTGAG 60

TATTAGTAAC CAAAATCCGA CCACATAGCC AGCCCCTATG AATATAGCCA TTAAAGCTAG 120

CATGGAATTT AGGAAATTAA AAACCACCGC AGATACAAAG GTTAGCACAA AAACATTAAA 180

AGCAATGGTG TCAGAAGCCA AGACTAGAAT ATAGGGTGTC AACCGATCTA AAGTTTTGGA 240

ATCTAGGAAA AATAAGTGTT TATACATGAT GACCTCCTCT ATGGCTGAAA AGCAAGCCTT 300

TTGTTTTTTT ACCCCAAGAC CCTATGTAGA AAAGTGAGCA AAAACGGGAA GGTCGCTACA 360

ATATTATTGA TCACATGCAC CGCATAGGAT GGATAAATGC TCTTGGTATA GCGGGTCAAA 420 

CCAGCAAAGA TGATTCCAAC TGTTGCAAAG ACGAAGATAT CTAACAGACT AGGCAGGCTT 480 

GAAAAATGAG GGAGAGCAAA TAAAATAGAA GGAAGAAGCA AATCAAGACC AAATCGCGAA 540 

TGCTTAAAGA AAGCATGTTG CAGTAATCCT CTATAAATCA ATTCTTCCAT CAGTGGAACC 600 

AGAAAGAACA GGGCTATATA AATACCTAGC TCTGCAAAGT TAGTCCCACT ATAAGCAATC 660 

AATACAGCCC AACCTTCCGC AGTTGACTGA ACATGTTTAG CTGTCTGAAC GTTAAAAGAG 720 

ATCTGGAACA CTAGCACTAA TACTGTCAAA ATCGAATACC AAAGCCATTT TTTTCTTGGA 780 

ATGCGGAAGA GATAACCATG GCCTGTCTTA ACAAGAACCA CAATCATGAC TCCAATAAAA 840 

AGTAAACTCA AGATATTTTG AATCCAGAAT AAATTGCCTA TCTGAGAAGA AAATTGCCAA 900 

TAGTTTTGGA CGATAAGCGT CAGCTGAGAA AGACTAAATA CGAAAAATAA GTAAGAGAAG 960 

ACTGCACTTA TTTTGAATAG AAGTTGATAC TTTTTCATAG AAATCCTCCC TACTATGACC 1020 

TCACCTTGTC AGGCTCTACT GCTGTAAGAT TAAGAAGACA GTTTGTTTTT TTTAAGGCTA 1080 

ACCTGACTAC TAGATAATAG ATACATTAAG GCATTAAAGA CAATGAAAAT ATGTCCATAG 1140 

AATAAAATCA ACCTCGCATC CAAACCAAGA TAAAGTTTGA TTATCAAAAA GATGAGCAAA 1200 

AGAATTTGAA ACCATAAGGT TTTTCCAAAA ATAAATTTAA AGCGATTTCG AATATCTACT 1260 

TCCTTGATTT TTACCGCCAC CCCTTTATTA GCAAGAAGGA AAACTCCTGC TTCAAACAAA 1320 

CCACTGTAAA GAACAAGCCA CCCAATAGAT ACGATAGAGA TTTGTAAAAA TGTCCCTAAA 1380 

AGAATATCCA ACACACTACT CAAGAAAATA ACAAAAAATA ATCTGTATTT CATATTAAAT 1440 

ACCTCCATTC ATTTATTTCA CTAACAATTT AATAGAGCCT TCTACTCAAA TATCCTGTCA 1500 

GAAAAGGATA GAAAGCTACT TTTTATAATA CTTCAAGCCC CACATGAGCA GAAGCGTGAT 1560 

AAACAAGCAG AGAATACACC TATATAAGCG ATTAGTTGTT GATAGAATTC TGTTTCTGAA 1620 

ATACCTCTAT ACAAACAAAT GACAAACATA AAATCTGCCA AGCCGATAAA CATAAGTTGA 1680 

TTGGTTCTAG GACTAACCAA ATCATCATTT ACTTATATTT AAGAGTATCT CTTTTATTTT 1740

AATGTATGTT AGCACTGAAA AGCAAGACAG GCCAATAATA TTTAAAATGA ACAGTAACGG 1800 

GGTTAAGTCT CTAAAAAAAT TATCTACTGA CACTACAAGA AATACTATAC ATATTATAGT 1860

CGAAACTATC TTTTTCTTAT CCATAATTAT TTACTCCTTT CCTAACAAAT CCAGCTTATC 1920 

AATCAAGAGC GATTTTTAAC ATAATGTAGC AGCACCCGTT GCAACTTTGA CAAGTTTAGT 1980 

ATATCATTGT TTTTTAAAAT TTTTCATCCA AATCTTGAAT TGTCATCGAA ACATCTTGAA 2040

TTGTTAAAAA ATTTAAAAAG TAAGCATTAA AAACATACTT TCCTCTTTAT ATTGTATTGA 2100

TACCAACTTG TTTGTAGACT TTTCATCCTG CTATCACATA TCATTTTGAC AGGCGAAACA 2160 

ATATTAAAGA AACTCCCCTG TAAATTAAGC TAGCAAATAC AGGGGAGAAA TTTATTTTTT 2220

AGAGAGTACT ATCCGTATCC TTTTTGGAAG ATTTTGAAAA TATTTTTCTA ATTAAGTCAT 2280 

CCATATAAGG ACCAAATATA CCAACTACTA AACCAATAAT AAAACTTTTA AAATCCATAA 2340 

TTACCACCAA CATATTGCTG CATAGGCTAC ACCTCCAAGT ATAGCTCCAC CTGCAGCACC 2400 

AGTTACACCT ATTCCTATAG CAAATGGTCC CAATAGAAAT GTCAAACCGT TGTTGCACAC 2460 

CCATCAATTG CGCCATATGC AACCCCTGCT GCACAACTAA TTTTTCTTCC CCAATCAATA 2520 

TCTCCACCTT CAACGCAAGC AAGCATTTCA TTATCCATAA CTGCAAATTG TGACATCATT 2580 

TTTGTATCCA TATAGTGTAT CACTTTTCAG TTACGGAACA AGTTTAATAT AAAAATTATC 2640 

AAAAAAACAT AGGCAATAAA GAGAAAAATT AATTTATCAT AGATTAGAAA TAATATGACA 2700 

AAACAATTCA ATGATGTTAA TTCAATAGTC TTTTGTTTTT TATCGGAGAT ACTTATGGAT 2760 

AGATAAATAA GATAGGTTTG AAAAGCGAAG AGAATAATAA AGAATATAGC CTTCATAAAA 2820 

TTTAGCTTTC ATTTTTATGA TGTAGCGGTA TAGGCTAAAT ATCCACAAAC CACTGCTCCT 2880 

CCAATTCCTC CTATTGCAGC GCCCCATGGT CCTAGAAGTC TCCCATATTT CACTCCACCC 2940 

GCTGCACAAC CTAAAGCAGC AACTACAGCT GCTCGTCCGG AATTACCTCC ATAAACCTCA 3000

CTCAGCATTG TTTCATTTAT ATTACAATAA GTATTCATAC AAGTCTCCTT TTATTAAAAT 3060 

CCACCCGTTG CCCCTGTTAC TCCTGCCCAA AGATCCACAC CAAATTTAGC TCCTATGTAT 3120 

CCACATGCTC CCATAAATGG TGCTCCAACA CCACTCGCAG CACAAATAGC TGTCCCTAGC 3180 

CCCCAGCCAC CAAAAGCAGC ACCACCACCT TCTAAGACAT TAGTTTGCCA ATTATTCTTG 3240 

CCTCCTTCAA TACTAGATAA CATAGTTATA TCCATTTCAT GAAATTGTTG CATAATTTTT 3300 

GTATCCATGA CAAATACTCT TTTTTATTTT TAATTTTTGT CTTGTTGTAA CTTTGACAAG 3360 

TTTAGTATAT CATCGTTTTT TAAAATTTTT CATCCAGATT TTGAATAGTC ATCGAAACGT 3420

CTTGAATTGC AAAAATTACA TTAGACTTCC TGCAAAACTA GAATCCTAGT TCATGATTGA 3480 

TAATACCAGC ACTCAAATTC ATTCGTAATC CGAAGCGTTT ACGATGACTT CGATAGGTTG 3540 

TTGAAAACAT TTTAAACGTT TTTACTTTGG CAAAGATGTT CTCAACCTTG CTTCTCTCCT 3600 

TAGATAGCGC ATGGTTACAG GCTTTATCTT CAACTGTTAG CGGTTTGAGT TTGCTGGATT 3660 

TACGTGAAGT TTGTGCTTGA GGATATATCT TCATGAGCCC TTGATAACCA CTGTCAGCCA 3720 

AGATTTTACC AGCTTGTCCG ATATTTCTGC GACTCATTTT GAACAACTTC ATATCATGAC 3780 

AATAGTTCAC AGTGATATCC AAAGAAACAA TTCTCCCTTG ACTTGTGACA ATCGCTTGAG 3840

TCTTCATAGC GTGAAATTTC TTTTTACCAG AATCATTCGC TAATTCTTTT TTTAGGGCGA 3900

TTGATTTTTA CTTCCGTCGC ATCAATCATT ACCGTGTCCT CAGAACTGAG AGGAGTTCTT 3960

GAAATCGTAA CACCACTTTG AACAAGAGTT ACTTCAACCC ATTGGCTCCG ACGGAGTAAG 4020

TTGCTTTCGT GAACACCAAA ATCAGCCGCA ATTTCTTCAT AAGTGCGGTA TTCTCGCACA 4080

TATTGAAGAG TGGCCATAAG AAGGTCTTCT AGGCTTAATT TAGGTTTTCG TCCACCTTTT 4140

GCGTGTTTAA GTTGATAAGC TGTTTTTAAT ACAGCTAGCA TCTCTTCAAA AGTCGTGCGC 4200

TGAACACCAA CAAGACGCTT AAATCGTGCA TCAGTTAGTT GTTTACTTGC TTCATAATTC 4260

ATAGAACTAT AGTAAAATGA AATAAGAACA GGATAAATCG ATCAGGACAG TCAAATCGAT 4320

TTCTAACAAT GTTTTAGAAG TAGAGGCGTA CTATTCTAGT TTCAATCTAC TATACTATAC 4380

CATATTTTGT TTCGCAGGGA ATCTATTATA AAAGGGTAAG TATTGCAAAA ACACTTACCC 4440

TTTTCTTTTA TACTTCATTA AGCTCTACTT TTTATAATAC TTCAAGCCCC ACATGAGCAG 4500

AAGCATGATG ATTAAGCAGA GAACAGCGCC AATATAAGCG ATTATTTGTT GGTAGGATTC 4560

TCCTGCTGTG ATACCTCTAT ACAAACAAAT AATAGACATA AAACCTGTGA AGCCGATGAA 4620

CATAAGTTGA TTGGTTCTAG GACTAACCAA ATCATCATCT TCAAACTCTC TTATCCTCAT 4680

TTCCCTAGTG AGATAAACAG TAACCAAAAT AGAAGCCAAG TTAATAACTA CTAAAAGAAA 4740

TTGGAAAACT ACGGAAAAAT TTAAAAACTG ACGAGATAGA AATAGATAAG TAGAAACAAG 4800

CAAGGGCAAC TGACCTAAGA ACAATCTCGC AAGGAAGATG TTCCGTTTTT TAGCAAGAAA 4860

AGTTTTCATT TCTTTTCTCC TTTCTTTTTA TTGATAGCAA AATAGATCAT AACTGCAATG 4920

ACATAGGCTA TGGTATAAAA TAGCTGATAC CAAGCACTCT CCCTAAGCGG ATATAGAAAG 4980

ATGGACATGA TTAGATACAG AACGAAAATA ATCAGTATTT TTTTCTTCAT AAGATTTCCT 5040

CCTAAATGTG CGATTTATCT TAGTTGAGCA AGAACATTTA CACTGCTAGT ATAGCACTTA 5100

TTTTGACCTT GGATCACTCA AATCATAAAT GGTCATCAAA ACCTCTTGAA TTGTAAAAAT 5160

TAAAAAAGCA AGCATGAAAA ACATACTTTC CTCTTTATAT TGTATTGATA CCAACTTGTT 5220

TGTAGACTTT TCATCCTGCT ATCACATATC ATTTTGACAG GCGAAACAAT ATTAAAGAAA 5280

CTCCCCTGTA AATTAAGCTA GCAAATACAG GGGAGAAATT TATTTTTTAG AGAGTACTAT 5340

CCGTATCCTT TTTGGAAGAT TTTGAAAATA TTTTTCTAAT TAAGTCATCC ATATAAGGAC 5400

CAAATATACC AACTACTAAA CCAATAATAA AACTTTTAAA ATCCATAATT ACCACCAACA 5460

TGTTGCTGCA TAGGCTACAC CTCCAAGTAT AGCTCCACCC GCAGCACCAG TTGCTGCACG 5520

TTGCCATGTT CCTGTTTTAA TGCCTAGTTG AAGACCTCTT GCTGCTCCTC CTCCAACACC 5580

TGCTTTGGCA AAATCTCCCC AATTGCATCC GCCACCTTCA ACGCAAGCAA GCATTTCAGT 5640

ATCCATAACA GAAAATTGTG ACATCATTTT TGTATCCATG ACAAATACTC CTTTTTTAAA 5700 

AAACTAAAAT AAATCAGAAT AGAATCCTCA TAATTTTACT ATAAGTCTTA CCAACTTAGT 5760 

CCCAATTTAT CACCAACCAT ACCTCCTAAG CATGTTAATC CACCCCCAAT TGCACCAATG 5820 

TGTGCTCCAA CAAATGCACC AGCAAGTCCA GCTACTCCTA AAGTGGCCAA ACCTGCTCCA 5880 

GTTCCACCAG TTATAATTCC CGTAGTGACT CCTGTAATCA GTGCATTTTG ACAATCAGTG 5940 

GAGCTATACC CCCCTTCAAC TTTCGCAAGC ATTTCAGTAT CCATAACCTC TAACTGTGAC 6000 

AACATTTTTG TATTCATGAT GAATACCTCC TTTTTATTTT CAATTTGTTA CCAAAGTCTT 6060 

AAATTCAATA AACAAATAGA TTTTTTATAG TATCTTTTTG ATTTTCTTAA AAAAGTATAT 6120 

ACGTCTACTA TCTTCTTAAA GGTAGCAGTA CCTATTTTTT AGTCTAAGAT TTCAATAATC 6180 

TTGAGTATCT AAAATATCTT AATTTCGTTA TTCTCCTTGC AATAAAAAGT TTTACTATAC 6240 

TATTTATTAA CTTGCAGAAA GCAAAAAATA TTAGTAAATA ATAGTTTATA GTTAAGTTTT 6300 

TTATTCCTAC CAATCCATCA ACTAAGTAAA GCATCAACGA TTACATAAAC GATTGATAAT 6360

ATAATTAAAA TTTTGCTAAC TATCTTATTC TCATCATTCT TAGATAACTT TGATATTTTG 6420 

TAAGTAAGTA AATAAGACAG TAAATTAATA GCGATAATAA TACTATATTT AAGAATCATA 6480 

ATCTTACAAA GAGGACATAA TTCCTGAACC TACACAAATA AGTGTTGCTG GTCCCCCAGT 6540

TATCGGACCA GTCGCAGCAG CTAATAGTAC TGCTCCAATA CAACCACCGA TTGCAGATCC 6600 

TAAATTGCCT CTTCCTCCAC TAACTATTTC GAGTTCTTCA TTATCCATAA CAGAAAATTG 6660 

TTCCATCATT TTTGTATTCA TGACAAATAC TCCTTTTTTC TTTTTTTATT TTTGTCTTGT 6720 

TGTAACTTTG ATAAGTTTAG TATATCATCG TTTTTTAAAA TTTTTCATCC AGATCTTGAA 6780 

TTGTCATCGA AACGTCTTGA ATTAGCTTTT TTATTTCAAG CCACCTCTAA ATGTTTAAAA 6840 

AAAATAATTT CTAATCACTT TTTTACCATT CAGGAAGTTT TAATGACTAT TCAAGATTTC 6900 

ATAAAATATG AACTTAGTTT TATGACATAA TAGACCTATC CACTATATGA AAGGAATTGC 6960 

CAATGACTTC TTATAAACGT ACATTTGTTC CTCAAATAGA TGCGAGAGAC TGTGGTGTCG 7020 

CTGCCTTAGC CTCGATTGCT AAATTCTATG GTTCAGATTT TTCTCTAGCT CACTTGAGAG 7080 

AACTTGCAAA GACCAATAAA GAAGGGACGA CTGCTCTTGG CATTGTAAAA GCCGCTGATG 7140 

AAATGGGCTT TGAAACAAGA CCTGTTCAAG CAGATAAAAC GCTCTTTGAC ATGAGTGATG 7200 

TCCCCTATCC ATTTATCGTT CACGTTAACA AAGAAGGAAA ACTCCAACAT TACTATGTTG 7260 

TCTATCAAAC AAAGAAAGAC TATCTGATTA TTGGTGATCC TGACCCTTCT GTAAAAATCA 7320 

CTAAAATGTC AAAAGAACGC TTTTTCTATG AATGGACTGG AGTAGCTATT TTTCTAGCTA 7380

CCAAACCCAG CTATCAACCC CATAAAGATA AAAAGAATGG TCTACTAAGC AAGCTTCCTT 7440

CCTCTGATTT TCAAACAAAA ATCTCTCATT GCTTACATTG TTCTCTCAAG CTTATTGGTC 7500 

ACTATTATCA ATATAGGTGG TTCTTACTAT CTCCAAGGAA TCTTGGATGA ATACATTCCA 7560 

AATCAGATGA AATCAACTTT AGGAATCATC TCAGTTGGTC TGGTTATCAC CTATATCCTC 7620 

CAACAAGTCA TGAGCTTCTC CAGAGATTAT CTCCTAACCG TTCTGAGTCA GAGATTAAGT 7680 

ATTGATGTGA TTTTATCCTA TATTCGCCAT ATTTTTGAAC TTCCCATGTC TTTCTTTGCG 7740 

ACACGTCGTA CAGGAGAAAT CATTTCACGA TTCACAGATG CTAACTCTAT TATAGATGCC 7800 

TTGGCTTCTA CCATTCTTTC TCTTTTTCTG GATGTTTCTA TTCTGATTCT TGTAGGAGGC 7860 

GTCTTACTGG CACAAAACCC TAATCTCTTC CTTCTTTCTC TTATTTCCAT TCCTATATAC 7920 

ATGTTCATCA TCTTTTCTTT TATGAAACCT TTCGAAAAAA TGAACCATGA TGTCATGCAA 7980 

AGTAATTCTA TGGTTAGCTC TGCCATTATC GAAGATATCA ACGGGATTGA AACTATAAAG 8040 

TCGCTCACGA GTGAAGAAAA TCGCTATCAA AATATAGACA GCGAATTTGT AGATTATTTG 8100 

GAAAAATCCT TTAAGCTCAG TAAATATTCT ATTTTACAAA CGAGTTTAAA GCAGGGAACA 8160 

AAATTAGTTC TGAATATCCT TATCCTATGG TTTGGCGCTC AATTAGTCAT GTCAAGTAAA 8220 

ATTTCTATCG GTCAGCTGAT TACCTTTAAC ACACTTTTTT CTTACTTTAC AACTCCTATG 8280 

GAAAATATTA TCAACCTCCA AACCAAACTC CAATCTGCGA AGGTCGCTAA TAACCGTTTG 8340 

AACGAAGTCT ATCTAGTCGA ATCTGAATTT CAAGTTCAAG AAAACCCTGT TCATTCACAT 8400 

TTTTTGATGG GCGATATTGA ATTTGATGAC CTTTCTTATA AGTATGGTTT TGGATGAGAT 8460 

ACCTTAACAG ATATTAATCT CACGATTAAA CAAGGAGATA AGGTTAGCCT AGTTGGAGTT 8520 

AGTGGTTCTG GTAAAACAAC TTTAGCCAAA ATGATTGTCA ATTTCTTTGA ACCCTACAAA 8580 

GGGCATATTT CCATCAATCA TCAGGATATT AAAAACATTG ATAAAAAAGT CTTGCGCCGT 8640 

CATATTAATT ACCTACCCCA ACAAGCCTAT ATCTTTAATG GCTCTATTTT GGAAAACTTA 8700 

ACCTTGGGCG GTAATCATAT GATTAGTCAA GAAGATATTC TAAAAGCTTG TGAAGTAGCT 8760 

GAAATCCGTC AAGACATTGA AAGAATGCCT ATGGGCTATC AAACTCAGCT CTCTGATGGA 8820 

GCTGGTCTAT CAGGAGGACA GAAGCAACGA ATCGCTCTCG CTCGTGCTCT TTTAACTAAA 8880 

TCTCCTGTTT TAATACTAGA TGAAGCTACT AGCGGTCTTG ATGTCTTGAC TGAGAAAAAG 8940 

GTTATAGATA ATCTTATGTC TCTAACTGAT AAAACCATTC TCTTTGTAGC CCATCGTCTC 9000 

AGTATAGCCG AACGAACCAA CCGTGTCATT GTTCTTGACC AGGGGAAAAT CATTGAAGTT 9060 

GGTA 9064

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7780 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

CTCCATTTTT TTGATTTCAT AAATAAACAA CCTCTCTGTT AATTTTGTAT AATTATAACG 60 

ATATCCAAGT TACTTGTCAA GTGTTTTTTA AATTTTTATC TCAAAAATAT TTTTTCGTTC 120 

AAAAAAAGGA GCCATCAGTT GATTTCAAGC TCCCTTTTAT ACAGAATTAA ACTATTTTAT 180 

AGTTCGACAA TCTTACCTGT TTCAAAGTAG ACAACCCATT CACAGATATT TTTAGCATAG 240 

TCACCGATAC GCTCCAAGTA GGAAATAACT TGGAAATAAT CACGACCCGT AACAATGGCT 300 

TCTGGATTTT TCTTAATCTC TTCAGTCGCA AGGTCACGGA TAGTTTCAAA ATAGTGGTTA 360 

ATTTGCTCAT CCATGGAGGC CACCCGGTAT GCGTCGTCAA CAGAACCATT AAGATAAAGA 420 

TCAAGTGCTG CTTCCACAAC GCTTTTAACT TCACGTCCCA TTTTTTTAAT TTCTTCCTCT 480 

ACAGCTGGAA TGCGCTCTTC CCCCTTCATA CGGATGGTTG CCTGGGCAAT GGCTACAGCG 540 

TGATCCCCCA TACGCTCCAC ATCTGATACA GCCTTAAGGA CAGTCAAGAC TGTACGCAAA 600 

TCTTGAGAGA CTGGTTGTTG GAGTGCGATC ATTTCAAATG ATTTCTTTTC CAGTTTCACT 660 

TCGTATTCAT TTACTTCTGC ATCATCTTCG ATGACCTCTT TTGCCAGGTC ACGGTCATGC 720 

GTGACAAAAG CACGTACCGT ACGATTGATT TGTGAGAGCA CTTCTTGTCC CATAGCGTAG 780 

AACTGGTTAT GTAATTTCTC TAAATCTTCT TCAAATTGAG ATCGTAACAT CTTTCATCTC 840 

CTTATCCAAA TTTTCCTGTA ATATAGTCTT CCGTTTCCTT GTGTTGGGGA TCAAGGAACA 900 

TCTGCTTGGT ATCATTAAAT TCAATCAAAT CTCCATCTAG GAAAAATCCT GTCTTATCAG 960 

AGATACGTGA AGCTTGCTGC ATGGAACGGG TTACCAGAAG CATGGTGTAC TTGTCTTTTA 1020 

GACCATACAA GGTTTCCTCA ATTTTACCAG CTGAAATCGG ATCCAAAGCC GAAGTTGGCT 1080 

CATCCAAGAG GATGATTTTA GGACTAGTTG CCAAGACACG GGCCACGCAG ACACGCTGCT 1140 

GTTGACCACC TGACAATCCA ATAGCTGAAT CATATAGACG ATCCTTGACC TCATCCCAGA 1200 

TAGAGGCACC TTGCAAGGCT TTTTCTACGG CTTCATCCAG AACCTGCTTA TCCTTAATTC 1260 

CATTGATACG AAGCCCGTAG ACAACATTCT CATAGATAGT CATAGGGAAA GGATTAGGTT 1320 

GTTGGAAAAC CATTCCGATT TCCTTACGTA ATTCAACCGT ATCTGTACGC GGACTGTAGA 1380 

TGTTGTGACC ATTGTACACC ACGGATCCAG TTGTGGTCAC CTCTGGATTG AGATCTCCCA 1440

TGCGGTTGAG AGACTTGAGG AGGGTTGACT TCCCTGATCC AGATGGACCA ATCAAGGCTG 1500

TAATTTCCTT AGGTTGGAAA GATAGGGAAA CACTATTCAA AGCCTTCTTT TTATTATAAT 1560 

AAACGGACAG GTCTGATACC TGTAAAATCG CATCTGTCAT ACGGTTTCCT TTCTAACCAA 1620 

AGTGACCAGA TACATAGTCA TTGGTGGACT GTAGCTTGGC ATTTTGGAAA ATAGTTGCAG 1680 

TCTTGTCATA CTCAATCAAA TCACCCAAGT AAAAGAAGCC TGTATAGTCA CTTGCACGAG 1740

CAGCCTGCTG GATATTATGC GTTACAATGA TGATGGTAAA GTTTTTCTTG AGCTCAAACA 1800 

TOGTCTCTTC TAGTTGCATG GTCGCAATCG GATCCAAGGC TGAGGCTGGC TCATCCATTA 1860

AGAGGATATC TGGCTTAACA GAGATGGCAC GAGCGATACA GAGACGTTGT TGCTGACCAC 1920 

CTGATAAGGT GAAGGCTGAC TTGTGGAGAT CGTCTTTAAG CTGATCCCAG AGGGCAGCCT 1980 

GACGAAGGGA GGTTTCTACG ATTTCATCTA GGACTTGCTT ATCCTTAACT CCAGCACGTT 2040 

CATGCGCAAA GGTAATATTA CGGTAAATTG ACTTAGCAAA TGGATTGGGA CGTTGAAAAA 2100 

CCATTCCAAT GTGTTTACGC ATTTCATAAA CGTTGATTTC TGGACGGTTG ACATCAATTC 2160 

CACGATAGAG AATCTGCCCA GTTACTTTAG CAATATCAAT AGTATCATTC ATGCGATTGA 2220 

GACTGCGTAA GTAGGTAGAT TTCCCCGATC CCGACGGGCC AATCAAAGCT GTAATTTTAT 2280 

TTCTTTCAAA TTGCATATCA ATCCCCTTAA TGGATTCATT TTTACCATAG TAAACATGGA 2340 

CATCCTTAGT AGAAAGGGCT ACTTTTTCTT CAGGAAAGGT AAGGATATGC TTCTCATCCG 2400 

AGTTATATGT TGACATGGCT TCTCCTTTAG GCAGCGGTTA ATTTCTTGTG TAGATAGCTT 2460 

CCGAACTTAC GAGCTCCAAA GTTAAAAATC AGGATAAAGA TCAGGAGCAC AGCGGCAGAA 2520 

CCTGCTGATA CAATGGTTCC ATCTGGAATA GTGCCTTCAC TATTGACTTT CCAGATATGG 2580 

ACAGCCAAGG TTTCTGCTTG ACGGAAGATA GAGATGGGGC TAGTCACACT GAGGATATTG 2640 

CAGTTAGACC AGTCAAGAGC TGGCGCCGAT TGCCCTGCTG TATAGATCAG AGCTGCAGCT 2700 

TCGCCAAAGA TACGACCAGA TGCCAAGACG ACACCCGTTA CAATACCTGG AAGCGCTTCC 2760

GGAATAACAA CATGAACCAC TGTCTCCCAG CGAGAAATCC CAAGAGCCAG ACCAGCCTCA 2820

CGTTGGGTAT GGTGAACGTG TTTCAAACTA TCCTCTACAT TACGCGTCAT CTGAGGCAAG 2880 

TTAAAGACTG TCAAGGCCAA GGCACCTGAA ATGATTGAAA ATCCATACTC AAACTGGACT 2940 

ACAAAGATCA AGTAACCAAA GAGACCCACC ACCACTGATG GTAAAGAGGA CAAAATTTCA 3000 

ATACAAGTCC GCACAAAGTT GGTAACAGGA CCTTTTTTAG CATATTCAGC CAAGTAAATG 3060 

CCAGCTCCCA TAGAAAGAGG TACAGAAATA ATCAAGGTAA TGACCAATAG GAAAAAGGAA 3120 

TTGTAAAGCT GAATGCCAAT CCCACCACCT GCTTGAAAAG CAGAAGACCT TCCAGTCAAG 3180

AAAGACCAAG AGATATGGGG CAAGCCCCGA ACCAAGATAT AGAGAATCAA GGAAGCCAAG 3240

ATTGTCACAA TGATGCTAGC AATCGTATAG AGGACAGCTG TTGCAAGTTT ATCTAATTTC 3300 

TTAGCGCGCA TAATTTTTCT TTCCTCTTTC TTTCGTAATC AATTTAATCA CACTGTTAAA 3360 

AACTAAGCTC ATCAAGAGCA GTACCAAGGC CAGTGACCAG AGAACATTAT TATTTACAGT 3420 

TCCCATGACA GTGTTCCCAA TTCCCATAGT TAATATAGAA GTTAAAGTTG CAGCTGGTGT 3480 

GGTCAAGGAA GTTGGGATAA CAGCTGAGTT TCCGACAACC ATCTGGATAG CTAGAGCCTC 3540 

ACCAAAGGCA CGCGCCATCC CAAAGACCAC TGCAGTGAAA ATACCAGAAC GGGCCGCCTT 3600 

CAAGATCACA CGCCAGATAG TCTGCCAGCG AGTGGCTCCC ATAGCGAAAC TGGCTTCACG 3660 

ATAATAACGA GGAACCGCAC GCAAGCTATC CGTTGTCATA AAGGTTACGG TCGGCAAAAT 3720 

CATGACAAAG AGGACGGAAA TCCCTGACAA AATCCCAAAA CCAGTCCCAC CAAAGACACT 3780 

GCGAACAAAG GGAACGACGA CTTGCAAGCC AATAAATCCG TACACTACTG AAGGAATCCC 3840

AACCAGGAGT TCAATAGCTG GTTGCAAAAT CTTCGCCCCT TTTGGTGATA CTTCGGTCAT 3900 

AAAAACTGCT GCACCAATAG CAAAGGGTGT TGCGATAAGG GCTGAGAGAA TGGTAACGAT 3960 

AAAGGAACCC AAAATCATAG GAAGGGCACC AAATTCTTTA CTAGAAGGAT TCCAAGTTCC 4020 

TCCCAAAAGA AAGTCAAAGA TATTCACACC ATTGACAAAG AAGGTCGACA AGCCTTTTTG 4080 

CGCTACGAAA ACCAAAATCA TGGCCACAAG GATGACTATC AAAGAAAGAC AGGCAAAGGT 4140 

CAAACCTTTT CCTAATTTCT CCAGACGAGA ATTCTTTGAT GGAAGCAACA TTTTCTTAGC 4200 

TAATTCTTCT TGATTCATTA TTGTCTCCCT TCCAACACTG TCACAGTTCC GGGAGCATCT 4260 

TTTTCAACCT TCATTTCCTT AATCGGAATA TACTTCAATC CTTTGACAAT CCCTTCTTGG 4320 

GTCTCATCCG AGAGAACAAA ATTGAGAAAT TCTGCAGCCA ACTCATTGGG CTGCCCCAAT 4380 

GTATACATAT GCTCATAAGA CCACAAGGGC CAATTATTGC TACTTATATT TTGTGGACTT 4440 

AAGTCATAGC CATTCAACTT CATGCTTTTG ACCGAATCAT CTATATAGGT AAGAGATAAA 4500 

TAAGAGATAG CTCCTGGACT TTTTGATACG ATTGATTTTA CCGCTCCATT TGAATCCTGC 4560

TCCTGACTTT GCATGGCAGA CTGACCTTCC ATAATGACAG TATCAAAGGT AGCACGAGAG 4620 

CCAGAGCCGG CTGCCCGATT GATAACAGAG ATGGGTAAGT CCTTACCACC AACCTCTTTC 4680 

CAATTGGTTA CCTCACCTAT GAAGATTTGA CGAAGTTGCT CTGTCGTTAG GTTATCAACA 4740 

TCAACCTCCT TATTGACAAT CAGAGCCAAG CCAGCTACCG CGACCTTGTG GTCAACAAGA 4800 

GCAGAAGCAT CAATTCCGTC TTTTTCCTCA GCAAATACAT CTGAGTTTCC TATATCAACT 4860 

GCCCCAGACT GAACCTGGGA CAAGCCTGTA CCAGAACCTC CCCCTTGGAC ATTGACCGTT 4920 

TTTCCAACAT GGATCGTGCC AAATTCATCT GCCGCTACTT CAACCAAGGG TTGCAAGGCA 4980

GTTGAGCCAA CAGCCGTTAT GGATTCTCCA CGATCAATCC AGCTAGCACA GCCTACTAAA 5040

CAAGCCGTCA GCCAAAAAGC GATAAGAGAC AGAGCAAGCT TTTTTCTTTT TTTCACTGTT 5100 

TTTCTCCTCG AAAATAATTA TGAATACTGT GAATTTTTTA AGTAGTTCTT TATGAGTTGA 5160 

CGCATGAATT CTTACCAAAT TTCTGCGCAA TTGATTATTT ATATAATATA GGCTATATTA 5220 

CTCTTTCCTA ACCTCCTTTT TTCATATGTG GATAAAATCT CTTGTCTATC CCTTCCCCCA 5280 

TTGTCACCCA TTATAGTCAT TTCGTGTCTC TTTTTCCCCT TTTTAATGCA AGGGAAATTA 5340 

CTCTCCTTAG ATGATAATCC AAAAGCTAGA AAGGTATCTC AAACCTCTCT ACTCTCCCAG 5400 

ACTAGTTTAC AACTAAAAGG AAAAGATTCT ATTTTATGAG AAATGTAGTT TACAAGCGGT 5460 

AAGAACGCTA ATAACTAAAC TTCTTGTACT CTTTGAAAAT CTCTTCAAAC CAGTGTTTTG 5520 

AGCTATCTAT GGCTAGCTTC CTAGTTTGCT CTTTGATTTT CATTGAGTAG TAAAACTACA 5580 

TGTAATGGCA ATCAAGATAT CAAGAATCAT CCTACTAAAA AAATCCATAC TTTCACTATA 5640 

ACATAGAATA AGATATTTGA CTAGCATTTT CATTTGAATC TGAGGCCTTT TGGAAAATAA 5700 

TTTTTCAAAA CATTTCCAGT AACCTTTGCA AAGCCCAAGC CATTGCCTTT AACCAAAACT 5760 

TGGTACCAAC CATTTGGCAG ACTTTCTGCC AGCTGAACGG TTTCTCCAGC CGCATACTTG 5820 

ACAAACGCTT CTTGGCCAAT TTCAACCGAC TGTTCGACCT GACTCGGTTT CAAGGCTAAA 5880 

CCAAGAGCGA AACTGGGCTC AAAGCGTTTC TTCTTAAAAG TACCCAGATG CAGTCCATTG 5940 

CGAGCAATCT TGAGCTTCCA TAAATCTGGC AAAAGTTCTG GCAAGAGATA AAGCTGGTCT 6000 

CCAAAAATCT GCAAGATACC CGGTAGATTG ACCTTCAAAT GGTTTTGGGC AAATTCCTGC 6060 

CACAAGGCAA CTTGTTCACG GCTGAGGTTA CTCTTACTTG CCTTAAATTT AGGAGCTGGA 6120 

TTGTTACCCT TAAACTGTAG ATGGGCAACA AACTGACCCT CTCCCTTAAA CTGATGAGGA 6180 

TACATCCGAG CCGTTTCTGG CAGGTCAATA CCAGCTACCA TTCCATTGAT ATGCTCTACT 6240 

GGCAACAAGT CAAAATCATA CTCTTCCAGC AACCAATTGA CAATCTCTTC GTTTTCCTCG 6300 

GGTGCCCAGG TACAGGTCGA ATAAACCAGA TGACCACCTT CAGCTAACAT GGTCACTGCA 6360 

TCCTCCAGAA TTTCTCTTTG CAAGCTAGCA CATTGACTCG GATAATCTAA GCTCCAATAG 6420 

TCCATAGCAT CAGGTTGCTT ACGAAACATT CCTTCACCAG AGCAAGGGGC ATCAAGAACG 6480 

ATTAAGTCAA AATAGCCTTT AAAGACCTTG ACCAAGCGGT CGGCAGATTC ATTGGTCACC 6540 

ACGACATTTG TCGCTCCAAA ACGCTCCATG TTTTCAACCA AAATCTTAGC CCGTTTGCTT 6600 

GAAATTTCAT TGGAAnCAAG TAGCCCCTCC CCTGCTAGAT AGGGTGCCAG TTGAGTTGAT 6660 

TTGCCCCCCG GTGCAGCAGC CAAGTCCAAG ACCTTCATAC CAGGACTGGG TTGGGCTACT 6720
TGAGCCACCA TTTGAGCAGC AGGTTCTTGC GAATAAACTA AACCTGTAGC ATGCTCAGGC 6780
GATTTCCCTG AAACCTTCCC ATAGTGGCCC CAAGGGGTTT GAGTAATGGC ATCAGAAAAG 6840 
GAAAGTTGCT CTTCTTTTAA GGGATTGACC CGAAAGGCCG AAACCGCTTC CTCCTCAAAA 6900 
GAGGCAAGAA AATCTCTTGC CTCATCTCCT AGTATCTCTT TATATTTTTC AACAAATCCT 6960 
TCTGGAAATT GCATTTAAGT TCTTTTCCTT TCGTAAATAT AGGACTGAAT TTCCTCCTGC 7020 
ATCTCAAGAG GCACCATCAT GACCGGCTGT CTGGTTTGAA AATCAGGAGC TTCACCAAAA 7080 
AGGGTCACAA CCCGATAGCC CAGACTTTCC CCTAAAATAC TAGCTGCGGC ATAATCCCAT 7140 
GGTTGCAGAT AAGTGAGATA GGTCAACAAA CGCCCTGACA AAATCTTGGC AAAACTAATG 7200 
GCCGCACTTC CATAGACACG AACACCAAGA ACCGCTCGGC TCAAATCAGC CAGCCCCCAT 7260 
TCATTGGTTT CCAGCATACC ACTATTCCCT GCAATGAGAA AATCTCCAAG TGGTTTAGTT 7320 
TTAAAAGGAG CTAGGGACCT ATCATTTAGA CAAACTGGAA ATTCCCCACC ACCGTGGTAA 7380 
CAATCCCCTT TGACCACATC ATAAATCAGA CCAAACTGTC CCTGACCATT TTCAAAATAA 7440 
GCCATCATAA CAGCAAAATC TTCCTGCTGG GCTACAAAAT TATTGGTACC ATCAATGGGA 7500
TCAATGACCC AAACCTTGCC CTCTTGAACC GAGGCTCGCA GACAACCTTC TTCAGCACAA 7560
ATCTTATCCT CAGGATAACG GGACAAAATC TCACCAACCA AGAGTTCCTG AACTTCTTTG 7620
TCCAGTCTGG TCACCAAATC TGTTGGAGAG GACTTGGTTT CAACACGCAA GTCTTCCTGC 7680
ATATGGTCAA GAATGTACTG ACCTGCTTTC TTAACAAGCT CTTTAGCAAA TTCAAATTTA 7740
CTTTCCAAGA GAAATCTTTC CTTCCCCTTT TTCTTTGGGG 7780

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4820 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

GTAATGATAT AGGAACACCA GGTGACCTGA TGGGACGTCG TAAGCCTATG AACTACTAGC 60

TGCTAAAGGC TTTAAAGATG GTATGGTACC ATATATCTCA AACCAATACG AAGAAGAAGC 120

CAAACAAAAG GGCAAGACAA TCAATCTCTA CGGTAAAACA AGAGGTTTGG TTACAGATGA 180

CTTGGTTTTG GAAAAGGTAT TTAATAACCA ATATCATACT TGGAGTGAGT TTAAGAAAGC 240

TATGTATCAA GAACGACAAG ATCAGTTTGA TAGATTGAAC AAAGTTACTT TTAATGATAC 300

AACACAGCCT TGGCAAACAT TTGCCAAGAA AACTACAAGC AGTGTAGATG AATTACAGAA 360

ATTAATGGAC GTTGCTGTTC GTAAGGATGC AGAACACAAT TACTACCATT GGAATAACTA 420

CAATCCAGAC ATAGATAGTG AAGTCCACAA GCTCAAGAGA GCAATCTTTA AAGCCTATCT 480 

TGACCAAACA AATGATTTTA GAAGTTCAAT TTTTGAGAAT AAAAAATAGT GTCTACTATT 540 

AGGAAATAAA GTTTAAAAAG GTGATGAAGA ACAAACCAAG ATTCAAGCAG GAATTCCTAC 600 

TGATAATGAA GTAAGTTATG ATCTTATTTA TCAGCAGGAA ACTCTTCCTG CAACAGGTTC 660 

ATCAACTTCT GAGCTTACAG CTTTAGGCCT ATTAGCTGTT GGTAGTTTAG TTCTTTTGGT 720 

TCATAATATG ACGGGAACAG TTTTTTGCTC CCTCTGAAAA GTCATCATTT GATGGCTTTT 780 

TTCTATATAG GGTAAAAGAT AGGGTAAAAG GCTATCATCG GACAAAATAA AGAAGGCATG 840 

ATATAATATA AAGTAGATTT CTATGTCATA AAACAAGAAC TGTTTGGACA TCATTCATTT 900 

GAAAACTCTC TATGTTCAAA CAATAGTAAA ATAAAATAGG GGATCTAAAT CCTTGCTATG 960 

AAAGGAAAAA ACTCAATGGC TACTATTCAA TGGTTTCCTG GTCACATGTC TAAAGCTCGT 1020 

CGACAGGTGC AGGAGAATTT AAAATTTGTT GATTTTGTGA CGATTTTAGT AGATGCACGC 1080 

TTGCCTCTAT CTAGTCAAAA TCCTATGTTG ACCAAGATTG TTGGTGATAA ACCAAAACTC 1140 

TTGATTTTAA ACAAGGCCGA CTTGGCTGAT CCAGCAATGA CCAAGGAATG GCGTCAGTAT 1200 

TTTGAATCAC AAGGAATGCA GACGCTAGCT ATCAACTCCA AAGAGCAAGT GACTGTAAAA 1260 

GTTGTAACAG ATGCGGCCAA GAAGCTCATG GCTGATAAGA TTGCTCGCCA GAAAGAACGT 1320 

GGGATTCAGA TTGAAACCTT GCGTACTATG ATTATCGGGA TTCCAAACGC TGGTAAATCA 1380 

ACTCTGATGA ACCGTTTGGC TGGTAAAAAG ATTGCTGTTG TTGGAAACAA GCCAGGGGTC 1440 

ACAAAAGGTC AACAATGGCT TAAAACCAAT AAAGACCTGG AAATCTTGGA TACACCGGGG 1500 

ATTCTCTGGC CTAAGTTTGA GGATGAAACT GTTGCACTTA AGTTGGCATT GACTGGAGCT 1560 

ATCAAAGACC AGTTGCTTCC TATGGATGAG GTTACCATTT TTGGTATCAA TTATTTCAAA 1620 

GAACATTATC CAGAAAAGCT GGCTGAACGC TTCAAACAAA TGAAAATTGA AGAAGAAGCG 1680 

CCTGTGATTA TTATGGATAT GACCCGCGCC CTCGGTTTCC GTGATGACTA TGACCGTTTT 1740 

TACAGTCTCT TCGTGAAGGA AGTCCGTGAT GGCAAACTCG GTAACTATAC CTTAGATACA 1800 

TTGGAAGACC TCGATGGCAA CGATTAAAGA AATCAAAGAA TTCCTTGTGA CAGTCAAGGA 1860 

GTTAGAAAGC CCTATTTTTT TAGAGCTTGA AAAGGATAAT CGCTCAGGAG TTCAAAAGGA 1920 

AATCAGCAAG CGTAAAAGAG CCATTCAAGC TGAATTAGAT GAAAATTTGC GCTTGGAATC 1980 

CATGCTTTCT TATGAAAAAG AACTTTATAA GCAAGGATTG ACCTTAATTG CAGGTATTGA 2040 

TGAGGTTGGT CGTGGTCCTC TTGCTGGTCC TGTAGTCGCT GCGGCCGTTA TTTTATCTAA 2100

AAATTGTAAG ATTAAAGGTC TCAACGACAG CAAGAAAATT CCTAAAAAGA AACATCTGGA 2160

GATTTTCCAA GCCGTTCAAG ACCAAGCCTT GTCGATTGGA ATTGGTATCA TAGATAATCA 2220 

GGTCATCGAC CAAGTCAACA TCTATGAAGC AACCAAACTA GCCATGCAAG AAGCAATCTC 2280 

CCAGCTCAGC CCTCAACCAG AGCACCTTTT GATTGATGCC ATGAAACTGG ACTTGCCCAT 2340 

TTCACAAACC TCCATTATCA AAGGAGATGC CAACTCCCTC TCTATCGCAG CAGCATCTAT 2400 

AGTAGCCAAG GTAACACGTG ATGAATTGCT GAAAGAATAC GATCAGCAGT TCCCTGGCTA 2460 

TGATTTCGCT ACTAATGCAG GATATGGCAC AGCTAAACAT CTGGAAGGCC TCACAAAACT 2520 

AGGAGTTACC CCAATTCACC GAACCAGCTT TGAACCCGTT AAATCACTGG TTTTAGGTAA 2580 

AAAAGAAAGT TAATTGAAAG GAAATAACAT GGAGGAACAG TCGGAAATAG TCCGTTCTAA 2640 

GAAAGAATTC GCCTTTGCAT CCAGCACTAT ACTATCCCAA GTTGGTCGAG GAATCATTGT 2700 

CGGCCTCATC GTTGGAATTA TCGTCGGATC CTTTCGTTTC TTAATTGAAA AGGGCTTCCA 2760 

CCTGATACAA GGAGTTTATC AAGATCAAGG GTACTTAGTG CGCAATCTTT TTGTACTGGT 2820 

TTTGTTTTAT ATACTCATCT GTTGGCTCAG TGCCAAACTA ACACGGTCAG AAAAAGATAT 2880 

TAAAGGCTCA GGAATTCCTC AAGTCGAAGC CGAACTGAAA GGCCTCATGT CCCTCAACTG 2940 

GTGGGGCATT CTTTGGAAAA AATATGTGCT AGGTATTCTT GCTATTGCCA GTGGACTCAT 3000 

GCTGGGTCGA GAGGGACCCA GCATTCAACT TGGAGCAGTT GGTGGTAAAG GAATTGCCAA 3060 

GTGGCTCAAA TCCAGTCCAG TAGAGGAACG TTCCTTGATT GCCAGTGGAG CTGCAGCAGG 3120 

TTTAGCCGCA GCCTTTAATG CTCCTATTGC AGCACTTCTC TTTGTTGTAG AAGAAGTCTA 3180 

TCACCATTTT TCGCGCTTTT TCTGGGTCTC AACTCTAGCA GCCAGCATCG TAGCAAACTT 3240 

TGTGTCTCTA CTCATGTTCG GTTTGACACC AGTATTGGAT ATGCCAGATA ACATTCCTCC 3300 

CATGACCCTA GATCAGTATT GGATATATCT CGTCATGGGA ATTTTCCTTG GATTTTCAGG 3360 

TTTTCTCTAT GAGAAAGCTG TATTAAACGT TGGAAGAGTT TATGACTTGA TTGGTCAAAA 3420 

AATCCATTTG GATAGGGCTT ATTATCCCAT CTTGGCTTTT ATCCTTATCA TACCAGTCGG 3480

AATCTTCTTA CCTCAAATCA TTGGTGGCGG AAATCAGCTT GTCCTTTCTT TAACTGAACA 3540 

AAATTTTAGT TTCCAAGTTT TATTAGCTTA CTTTTTAATC CGCTTTATTT GGAGTATGAT 3600 

TAGCTATGGA AGTGGACTGC CAGGAGGAAT TTTCCTCCCC ATTTTAGCTC TTGGTTCTTT 3660 

GCTTGGTGCC TTAGTTGGTG TTATCTGTGT CAATCTTGGA CTTGTCAGTC AAGAGCAATT 3720 

CCCTATATTT GTCATTCTAG GAATGAGTGG CTATTTTGGA GCCATATCAA AAGCTCCCTT 3780

AACCGCTATG ATCCTCGTAA CTGAGATGGT AGGAGATATT CGCAACCTTA TGCCACTTGG 3840 

TCTTGTCACT CTTGTTTCTT ATATTATCAT GGATTTGCTC AAAGGTACGC CAGTCTATGA 3900

AGCCATGCTG GAAAAAATGC TTCCAGAAGA AGTATCTAGC GAAGGAGAAG TTACACTTAT 3960

CGAAATACCA GTTTCTGATA AAATTGCTGG GAAACAAGTT CATGAACTCA ACTTACCACA 4020 

CAACGTCCTC ATCACAACTC AAGTCCATAA TGGCAAGAGC CAAACAGTTA ACGGCTCAAC 4080 

CAGAATGTAT CTGGGTGATA TGATTCACCT GGTTATTCCA AAAAGTGAAA TTGGAAAAGT 4140 

CAAAGATTTG TTGTTGTAGT ATGAGTATTT ACATAATTTA TGTTATGTAA ATGATCAGTT 4200 

TGATTTATTT AGAAAACCGA TTCTCAGGAA TGAGATCGGT TATTTTTTAC TGATGAGGAA 4260 

TTTTACATAT AAATAATTGA ACTTTATTAA AAATAAGACT ATAATTAAGT TAGAAATGAT 4320 

AAAGTATAAA GCTAGAAAGG AGTTTACTGT ATCAAATCTG TACAGTAAGA TTAAAATCAT 4380 

GAAAAAGAAA ACAATAGCAA TTATATAGAG AAATGAAATA GAAATAGGAT AAAACAATCA 4440 

GGACAATCAA ATCAATTTCT AGCAATGTTT TAGAAGTCCA GATGTACTAT TCTAGTTTCA 4500 

ATCTATTATA CAATGTGTTT TGTATCTCAT AGCTCCTTAT ATAGCTCTTC AGTTATGTAG 4560

TATTAACAGA AGTTTAGTGG GTGAGATTTT TATTATTTTC CTTATTCTGT TTTGTTTGTA 4620 

GGTCTAAGTC TTTTTATCAC TTTGAAAAAC TCCTATAACA TCTTTCCGAA AAACTATAAT 4680 

TTTCTTGAAA AATATACAAG TCTATGCTAT ACTACTAGTA TACTTACTTA TGGAGAAAAT 4740 

ACATGAAACG TGAGATTTTA CTGGAACGAA TCGACAAACT AAAACAACTC ATGCCCTGGT 4800 

AAGTTCTGGA ATACTACCAA 4820 
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21338 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

CTACGACATG ATGATTAACA GTCATGCGCT ACTACCAACT GAGCTATGGC GGATAAAATA 60 

GTCCGTACGG GATTCGAACC CGTGTTACCG CCGTGAAAAG GCGGTGTCTT AACCCCTTGA 120 
CCAACGGACC TTCTATCTGT AGCAGATATA ACCATTATAT CAATTTCTTG CTAATTGTCA 180

ATCACTTTTG AGATTTTTTC TCTAAAATAT CTTTTAATTT TCTAATTTTT AATCTTGAAA 240 

TAGGACAACG ATGGTCTTCA TAGAAAACAA TTTCTAAGTT TTTTCGATCA ATTTCTCTGA 300 

TATTACCTAT ATTTACCAAA AATGACTTGT GAGGAGAATA AAATCGCTGA GTATGTTTGT 360 

CCTTTTCCTG AATATCTGTC ATGGTACCAT AAAACTCTTT TGCAAAATTC TTACCAATAA 420

TGCGCAATTT ATGAGATACC CCTGTTGTTT CAATATACAA AATATCATGG TAAGGAATTT 480
TTAAATCATT TCCCTTGTAA TTGTAGTCGA AATAATCTAC AACATCTTCA TTTTCAAGTA 540
ACATACTCTT CGTGTAGAAG ATATTTTGCT CAATTCTCTT CTTAAACATC TCATCATTGA 600
TATCCTTATC AACAAAATCT AGGGCTGATA CCTGGTATTT ATAGGTTAGA GTCGCAAACT 660
CTGATCGACT AGTGATAAAG ACGATAATAG CGTAAGGATT GTAATGACGA ATGAGCTGAG 720
CCACTTCAAA TCCCTTTTTC TCAATTCCAT GAATATCGAT ATCTAGGAAA TAAAGCTGAT 780
TTACTTCATC ATTTTCAATG TATTCTTCAA ATTCACGGAC TTTTCCCGTT GTCTTGTATG 840
ATATTGGAAT ATTCGATTCT TTCGAAATTT CATCCAATAT TCTCTCTAGT CTCACTTGAT 900
GTTCAATAAC ATCTTCTAAA ATTAAAACTT TCATTCAAAT TCCCTCTTAA ATCTAATGAT 960
TTGTCTAAAT GTACTGCCTT CCATCTCTGT TTCTAAAATA ATATTGTTGT ACTTATCTAG 1020
TAGTTCTTTC ACATTATTTA ATCCGACTCC GCGATTTCTT CCCTTAGTGG AGAATCCTAA 1080
GGCAAATAGA TCTCCTGAAG GAGTCATCGT CATTTTACAT GAATTCTGAA TCACAATAAC 1140
TGTTTCAGTT TCCATCTTAA TAACTGCTAC TTCCATCTGC TTTTTATAGC TATCAGCCGA 1200
TCCTTCGACA GCATTATTCA ATAAAACGCT CATGATACGA ACCAAATCCA ATAGTTCAAT 1260
TGGAAGCTTG GTAATCGTAT CTTTTACTTC CAGTGTAAAC TCTACACCAT TATTTCGAGC 1320
ATAGACAATT GACTGAGCAA CCAAACTTCG TAAAGCTGAG TCTTCTATGT TGTTCAAATC 1380
AAAGTAAGTG TACTTATCTG AACGCAATTT ATGATTTGCT TTGACTAAAA CTTCATTGTA 1440
AATTCTGTCA ATTTCCTGTA AATTACCACT GTCAATTGCC ATCTGCATGC TGACAAGCAT 1500
TCCAGCATAA TCATGTCGAA AACCACGGAT TTCATTATAC AGACCAACAA TTTCATCTGT 1560
GTAATTCTGT AAATGTTTCT GTTCAAATTT CTTCTGCTTC AAAGCAATCT CTTTCTCCAT 1620
TTGAACTTTA TGAGAATTCA TTGCAAAGAA GGTCAAAAGG AGAGAGATAA AGACAATAGA 1680
TGACAAAATA CTTCCAAAAC TATTCAAATG TTTAATCGTA CTTACCATAT CTGAAACGAA 1740
AGATACAATA TGTAGCAATA GTAAAGCAAA AAATACTTTT TTCAAGAAAG GATAAAGGTA 1800
GTCCTTGTCA AAATAGGCTA GTTCCAAATG GAAATAGTAA ATGATTTTTA ATGTAACAAA 1860
ATAGGTTAAC ACCGTCACAA CGAAAAAGAA TGGGAAATGA TATTGTAAAA CAAAATTATC 1920
TCCTGTTATA GAGGAGAAAA TTACGGACAG AAAGTTATGA GTGCTCTCAT ATAAAAGAGA 1980
TAGTAGTAAA CTTAGGAATA GTCCTCTATC CCTCTCATAC TGTTTCATCC ATCGAAAATA 2040
GGAATATAAG CCCAAAGGAA ATAAAAATCT TTCAATCCCT ATTTTATCTA AATATAGAAG 2100
ATAAAAGGAA AATTCAAGTA CTATTTCAGT TAGTAATGTA TAAGCACCAA AAACGTATAA 2160
TTCTTTTCTA TTTATTCGAC CTTTACAAAT TAAACGGTAA CTGTGACTAA TAATTAAAAA 2220

ATGAACAATA ACTGTCCCAA ATCCAAGTAA ATCCATTACT CTTTCTCCTT ATTTCATTAC 2280

TTTTTTCGTA GGAAAAGAAA ATCAAGGATG ATTCTTGAAA TCCTCATCTC CCCACCTTTA 2340 

ATCTTTTGTA AGTCTTTTTC CTTCAAAGCT ACAAACTGTT CCAATTTAAC TGTGTTTTTC 2400 

ATAATAAAAT CTCCTAAAAT GTTTTTTCTT GTAAGCTAAC TTACAAAAAC CATTATACAA 2460 

AATGGAATTT CGTTTTAGAT AAAATTCTCT CAACTGTCAT TTTTTTCTCC CAAAGTGTAC 2520 

TTTTTTAAGA AAAAAGCCGG GAAAATTCCC AGCTTTGCTA TTATATTGAT CCCAGCAGGA 2580 

TTCGAACCTG CGACCGTTCG CTTAGAAGGC GAATGCTCTA TCCAGCTGAG CTATGAGACC 2640 

TAATACAATT ATTCTACCAA AAATTCAATT AAAAGTCAAT TTTGTATTTA TGGTAGGGGA 2700 

ATCCCTGCTG AATCGTAAAA GCGCGATAGA TTTGTTCAAC AAGAACTAGT CTCATTAACT 2760 

GATGGGGTAA GGTTAGGCGA CCAAAACTGA CAGAAAGATT GGCTCTATTT TTTACAGATG 2820 

ATGATAATCC TAAACTTCCC CCAATAATAA AAGTAAGAGT AGAAAATCCT TTTATAGAAG 2880 

TTTCTTCTAA CTGCTTACTA AATTCTTCTG AGAAGAAAGT TTTCCCTTCA ATGGCTAACA 2940 

CAATAACGAA ATCACGGTCA GCAATTTTTG ATAAAATTCT CTGACCTTCT ATTTCTAAAA 3000 

TCTTTTGATT TTCTGATTCA CTGGCCTTAT CTGGTGTTTT TTCATCTGAT AACTCAATGA 3060 

TTTCAAACTT AGCAAATCTA GAAATTCGTT TTGAATACTC TGCGATACCA TCTTTTAAAT 3120 

ACTTTTCTTT CAGTTTCCCA ACTGTTACAA CTTTAATTTT CATGACTCTA TTCTAACATA 3180 

TTCTCTATTT TTTCACATCT TATTCACAAA ATAAAAAATA GATTTCAATT AAGAAAATCA 3240 

CAATTTCAAA AGAGTTATCC ACAGTTTGTG TAAAACTTTT GTGTTTAAGT TATAATTAAG 3300 

CTAGTCAGTT TATACTTTCA GTAATTCAAA CATATGGAGG CAAATATGAA ACATCTAAAA 3360 

ACATTTTACA AAAAATGGTT TCAATTATTA GTCGTTATCG TCATTAGCTT TTTTAGTGGA 3420 

GCCTTGGGTA GTTTTTCAAT AACTCAACTA ACTCAAAAAA GTAGTGTAAA CAACTCTAAC 3480 

AACAATAGTA CTATTACACA AACTGCCTAT AAGAACGAAA ATTCAACAAC ACAGGCTGTT 3540 

AACAAAGTAA AAGATGCTGT TGTTTCTGTT ATTACTTATT CGGCAAACAG ACAAAATAGC 3600 

GTATTTGGCA ATGATGATAC TGACACAGAT TCTCAGCGAA TCTCTAGTGA AGGATCTGGA 3660 

GTTATTTATA AAAAGAATGA TAAAGAAGCT TACATCGTCA CCAACAATCA CGTTATTAAT 3720 

GGCGCCAgCA AAGTAGATAT TCGATTGTCA GATGGGACTA AAGTACCTGG AGAAATTGTC 3780 

GGAGCTGACA CTTTCTCTGA TATTGCTGTC GTCAAAATCT CTTCAGAAAA AGTGACAACA 3840 

GTAGCTGAGT TTGGTGATTC TAGTAAGTTA ACTGTAGGAG AAACTGCTAT TGCCATCGGT 3900 

AGCCCGTTAG GTTCTGAATA TGCAAATACT GTCACTCAAG GTATCGTATC CAGTCTCAAT 3960 
AGAAATGTAT CCTTAAAATC GGAAGATGGA CAAGCTATTT CTACAAAAGC CATCCAAACT 4020

GATACTGCTA TTAACCCAGG TAACTCTGGC GGCCCACTGA TCAATATTCA AGGGCAGGTT 4080 

ATCGGAATTA CCTCAAGTAA AATTGCTACA AATGGAGGAA CATCTGTAGA AGGTCTTGGT 4140 

TTCGCAATTC CTGCAAATGA TGCTATCAAT ATTATTGAAC AGTTAGAAAA AAACGGAAAA 4200 

GTGACGCGTC CAGCTTTGGG AATCCAGATG GTTAATTTAT CTAATGTGAG TACAAGCGAC 4260 

ATCAGAAGAC TCAATATTCC AAGTAATGTT ACATCTGGTG TAATTGTTCG TTCGGTACAA 4320 

AGTAATATGC CTGCCAATGG TCACCTTGAA AAATACGATG TAATTACAAA AGTAGATGAC 4380 

AAAGAGATTG CTTCATCAAC AGACTTACAA AGTGCTCTTT ACAACCATTC TATCGGAGAC 4440 

ACCATTAAGA TAACCTACTA TCGTAACGGG AAAGAAGAAA CTACCTCTAT CAAACTTAAC 4500 

AAGAGTTCAG GTGATTTAGA ATCTTAATTG ACATCTATGT AAAGAAAGCT TTACATAAGA 4560 

GAAAAGATGT GTTAGTGTAG AATCATGGAA AAATTTGAAA TGATTTCTAT CACAGATATA 4620 

CAAAAAAATC CCTATCAACC CCGAAAAGAA TTTGATAGAG AAAAACTAGA TGAACTAGCA 4680 

CAGTCTATCA AAGAAAATGG GGTCATTCAA CCGATTATTG TTCGTCAATC TCCTGTTATT 4740 

GGTTATGAAA TCcTTGCAGG AGAGAGACGC TATCGGGCTT CACTTTTAGC TGGTCTACGG 4800 

TCTATCCCAG CTGTTGTTAA ACAGATTTCA GACCAAGAGA TGATGGTCCA GTCCATTATT 4860 

GAAAATTTAC AGAGAGAAAA TTTAAACCCA ATAGAAGAAG CACGCGCCTA TGAATCTCTC 4920

GTAGAGAAAG GATTCACCCA TGCTGAAATT GCAGATAAGA TGGGCAAGTC TCGTCCATAT 4980 

ATCAGCAACT CCATTCGTTT ACTTTCCTTG CCAGAACAGA TTCTTTCAGA AGTAGAAAAT 5040 

GGCAAACTAT CACAAGCCCA TGCGCGTTCC CTAGTTGGGT TAAATAAGGA ACAACAAGAC 5100 

TATTTCTTTC AACGGATTAT AGAAGAAGAT ATTTCTGTAA GGAAATTAGA AGCTCTTCTG 5160 

ACAGAGAAAA AACAAAAGAA ACAGCAAAAA ACTAATCATT TCATACAAAA TGAAGAAAAA 5220 

CAGTTAAGAA AACTACTCGG ATTAGATGTA GAAATTAAAC TATCTAAAAA AGACAGTGGA 5280 

AAAATCATTA TTTCTTTTTC AAATCAAGAA GAATATAGTA GAATTATCAA CAGCCTGAAA 5340 

TAAGGCTGTT CTTTTATTTT TTTATCTCAC AAGGTTATCC ACTATGTTTT TCGATAAAAA 5400 

GCTTAATAAA TCAATAATTT CTTCTTTTAT CCCCAACCTG TGGATAAAGT TTGGTAACAT 5460 

TGTGGATTAT TTTTCACAGC TTGTGGAAAA TTCTTGCTAT CTATGGTAAA ATATCTCTAG 5520 

TATTAAACTT TTAAATAGTA AAGGAGGAGA AAGGATTGAA AGAAAAACAA TTTTGGAATC 5580 

GTATATTAGA ATTTGCACAA GAAAGACTGA CTCGATCCAT GTATGATTTC TATGCTATTC 5640 

AAGCTGAACT CATCAAGGTA GAGGAAAATG TTGCCACTAT ATTTCTACCT CGCTCTGAAA 5700 

TGGAAATGGT CTGGGAAAAA CAACTAAAAG ATATTATTGT AGTAGCTGGT TTTGAAATTT 5760 
ATGACGCTGA AATAACTCCC CACTATATTT TCACCAAACC TCAAGATACG ACTAGCTCAC 5820

AAGTTGAAGA AGCTACAAAT TTAACTCTTT ATAACTATAG TCCAAAGTTA GTATCTATTC 5880 

CTTATTCAGA TACGGGATTA AAAGAAAAGT ATACCTTTGA TAACTTTATT CAAGGGGATG 5940 

GAAATGTTTG GGCTGTATCA GCCGCTTTAG CTGTCTCTGA AGATTTGGCT CTGACCTATA 6000 

ACCCTCTTTT TATCTATGGA GGACCAGGCC TTGGTAAGAC TCACTTATTA AACGCTATTG 6060 

GAAATGAAAT TCTAAAAAAT ATTCCTAATG CGCGTGTTAA ATATATCCCT GCCGAAAGCT 6120 

TTATTAATGA CTTTCTTGAT CACCTAAGAC TTGGGGAAAT GGAAAAGTTT AAAAAGACCT 6180 

ATCGTAGTCT TGATCTTTTG TTAATCGATG ATATCCAGTC ACTCAGCGGA AAAAAAGTCG 6240 

CAACTCAGGA AGAATTTTTC AATACCTTTA ACGCCCTTCA TGACAAGCAA AAACAGATTG 6300 

TCCTAACGAG TGATCGTAGT CCAAAACATC TAGAAGGGCT CGAGGAGAGG CTTGTCACGC 6360 

GTTTTAGTTG GGGATTGACA CAAACTATCA CCCCCCCTGA CTTTGAAACA CGTATTGCCA 6420 

TTTTACAAAG TAAGACGGAA CATTTAGGCT ACAATTTCCA AAGTGATACT CTAGAATACC 6480 

TAGCTGGGCA ATTTGATTCA AATGTTCGAG ATCTTGAGGG AGCCATCAAC GACATCACTT 6540 

TAATTGCCAG AGTAAAAAAA ATCAAGGATA TCACTATTGA TATTGCTGCA GAAGCCATTA 6600 

GAGCCCGCAA ACAAGATGTT AGCCAAATGC TCGTGATCCC AATTGATAAA ATCCAAACTG 6660 

AAGTTGGTAA CTTTTATGGT GTTAGTATCA AAGAAATGAA GGGAAGTAGA CGCCTTCAAA 6720 

ATATTGTTTT GGCCCGTCAA GTAGCCATGT ATTTATCTAG AGAACTAACA GATAATAGTC 6780 

TTCCAAAAAT TGGGAAGGAA TTTGGGGGAA AAGATCATAC CACAGTCATT CATGCCCATG 6840 

CCAAAATAAA ATCTTTGATT GATCAAGACG ATAATTTACG TTTAGAAATT GAATCAATCA 6900 

AAAAGAAAAT CAAATAATTT GTGGATAACT TTTAGTTTTT TATCTTTTTT ATCCACATTT 6960 

TTTAAACAAG CTAAAAAACT TGATATGACT TGTTTAAAGG CTGTTTTCCA CAGATTTCAC 7020 

AGACTCTATT ATTAGTATTA TCTTTCTAAT ACTAAAAATA AATAAAGGAG AATCCATGAT 7080 

TCATTTTTCA ATTAATAAAA ATTTATTTCT ACAAGCATTA AATACTACTA AGAGAGCTAT 7140 

TAGTTCTAAA AATGCCATTC CTATTTTATC AACAGTAAAA ATTGACGTGA CCAATGAAGG 7200 

TATTACTTTA ATTGGTTCAA ATGGTCAAAT TTCAATTGAA AATTTTATTT CTCAAAAAAA 7260 

TGAAGATGCT GGTTTGTTAA TTACTTCTTT AGGTTCGATC CTTCTTGAAG CTTCTTTCTT 7320

TATCAATGTA GTATCTAGTT TACCTGATGT AACTCTTGAT TTTAAAGAAA TTGAACAAAA 7380 

TCAAATTGTT TTAACCAGTG GCAAATCAGA AATTACCCTA AAAGGAAAAG ATAGGGAACA 7440 

ATATCCACGA ATCCAAGAAA TTTCAGCAAG CACTCCTTTA ATACTTGAAA CAAAATTACT 7500 
CAAGAAAATT ATTAATGAAA CAGCCTTTGC TGCAAGTACA CAAGAGAGTC GTCCGATTTT 7560

AACAGGTGTC CACTTCGTAT TGAGTCAACA CAAAGAGTTA AAAACAGTTG CAACAGACTC 7620 

TCATCGCCTA AGCCAGAAAA AATTGACTCT TGAAAAAAAT AGTGATGATT TTGATGTCGT 7680 

AATTCCTAGC CGTTCTCTAC GCGAATTTTC AGCGGTATTT ACAGATGATA TCGAAACTGT 7740 

AGAGATTTTC TTTGCCAATA ACCAAATCCT CTTTAGAAGC GAAAATATTA GCTTCTATAC 7800 

TCGTCTCCTA GAAGGAAACT ATCCTGATAC AGATCGCTTG ATTCCAACAG ACTTTAACAC 7860 

TACTATTACT TTTAATGTGG TAAACTTACG CCAGTCAATG GAGCGTGCCC GTCTTTTATC 7920 

AAGTGCGACT CAAAATGGTA CTGTGAAACT TGAAATTAAG GATGGGGTTG TTAGCGCCCA 7980 

TGTTCACTCT CCAGAAGTTG GTAAAGTAAA CGAAGAAATC GATACTGATC AGGTTACTGG 8040 

TGAAGATTTG ACCATTAGTT TCAACCCAAC TTACTTGATT GATTCTCTTA AAGCTTTAAA 8100 

TAGCGAAAAG GTGACTATTA GCTTTATCTC AGCTGTTCGT CCATTTACTC TTGTGCCAGG 8160 

AGATACTGAC GAAGACTTCA TGCAGCTCAT TACACCAGTT CGTACAAATT AAGTGAAAGA 8220 

GGTTGAGCCT GGCTCGCCTC TTTTATGATA TAATCGAAAA AGAAAAGGAG AGTAGTATGT 8280 

ATCAAGTTGG AAATTTTGTT GAGATGAAAA AATCACACGC TTGTACAATC AAGTCGACTG 8340 

GTAAAAAGGC TAATCGTTGG GAAATTACAC GTGTAGGAGC AGATATCAAA ATAAAATGTA 8400 

GTAATTGTGA GCATGTTGTC ATGATGGGGC GATATGATTT TGAGCGAAAA ATGAATAAAA 8460 

TTATTGACTG AGAACCCTTA GTTAGAGGGT TAGCACTTTA TCCCTTTTTG TGTTATAATA 8520 

TTAGGGATTG AAATGAAAAC GGAGAATGAG AAATATGGCT TTGACAGCAG GTATCGTTGG 8580 

TTTGCCAAAC GTTGGTAAAT CAACACTATT TAATGCAATT ACAAAAGCAG GAGCAGAGGC 8640 

AGCAAACTAC CCATTTGCGA CGATTGATCC AAATGTTGGA ATGGTGGAAG TTCCAGATGA 8700 

ACGCCTACAA AAACTAACTG AAATGATAAC TCCTAAAAAG ACAGTTCCCA CAACATTTGA 8760 

ATTTACAGAT ATTGCAGGGA TTGTAAAAGG AGCTTCAAAA GGAGAGGGGC TAGGGAATAA 8820 

ATTCTTGGCC AATATTCGTG AAGTAGATGC GATTGTTCAC GTAGTTCGTG CTTTTGATGA 8880 

TGAAAATGTA ATGCGCGAGC AAGGACGTGA AGACGCCTTT GTAGATCCAC TTGCAGATAT 8940 

TGATACCATT AATCTGGAAT TGATTCTTGC TGACTTAGAA TCAGTGAACA AACGATATGC 9000 

GCGTGTAGAA AAGATGGCAC GTACGCAAAA AGATAAAGAA TCAGTAGCAG AATTCAATGT 9060 

TCTTCAAAAG ATTAAACCAG TCCTAGAAGA CGGGAAATCA GCTCGTACCA TTGAATTTAC 9120 

AGATGAGGAA CAAAAGGTTG TCAAAGGTCT TTTCCTTTTG ACGACTAAAC CAGTTCTTTA 9180 

TGTAGCTAAT GTGGACGAGG ATGTGGTTTC AGAACCTGAC TCTATCGACT ATGTCAAACA 9240 

AATTCGTGAA TTTGCAGCGA CAGAAAATGC TGAAGTAGTC GTTATTTCTG CGCGTGCTGA 9300 



WO 98/18931 



PCT/US97/19588 




GGAAGAAATT TCTGAATTGA ATGATGAAGA TAAAAAAGAG TTTCTTGAAG CCATTGGTTT 9360 

GACAGAATCA GGTGTAGATA AGTTGACGCG TGCAGCTTAC CACTTGCTTG GATTGGGAAC 9420 

TTACTTCACA GCTGGTGAAA AAGAAGTTCG CGCTTGGACT TTCAAACGTG GTATGAAGGC 9480 

TCCTCAAGCA GCTGGTATTA TCCACTCAGA CTTTGAAAAA GGCTTTATTC GTGCAGTAAC 9540 

CATGTCATAT GAAGATCTAG TGAAATACGG ATCTGAAAAG GCCGTAAAAG AAGGTGGACG 9600 

CTTGCGTGAA GAAGGAAAAG AATATATCGT TCAAGATGGC GATATCATGG AATTCCGCTT 9660 

TAATGTCTAA AAATTAATAA ATGGTGTCAA TTAGGTTGGA AAAAAATTCC AACCCTTTTG 9720 

GCTTTTGAAA GGAAAAATAA ATGACCAAAT TACTTGTAGG CTTGGGAAAT CCAGGGGATA 9780 

AATATTTTGA AACAAAACAC AATGTTGGTT TTATGTTGAT TGATCAACTA GCGAAGAAAC 9840 

AGAATGTCAC TTTTACACAC GATAAGATAT TTCAAGCTGA CCTAGCATCC TTTTTCCTAA 9900 

ATGGAGAAAA AATTTATCTG GTTAAACCAA CGACCTTTAT GAATGAAAGT GGAAAAGCAG 9960 

TTCATGCTTT ATTAACTTAC TATGGTTTGG ATATTGACGA TTTACTTATC ATTTACGATG 10020 

ATCTTGACAT GGAAGTTGGG AAAATTCGTT TAAGAGCAAA AGGCTCAGCA GGTGGTCATA 10080 

ATGGTATCAA GTCTATTATT CAACATATAG GAACTCAGGT CTTTAACCGT GTTAAGATTG 10140 

GAATTGGAAG ACCTAAAAAT GGTATGTCAG TTGTTCATCA TGTTTTGAGT AAGTTPGACA 10200 

GGGATGATTA TATCGGTATT TTACAGTCTG TTGACAAAGT TGACGATTCT GTAAACTACT 10260 

ATTTACAAGA GAAAAATTTT GAGAAAACAA TGCAGAGGTA TAACGGATAA ATGGTGACCT 10320 

TATTAGATTT ATTCTCAGAA AATGATCAGA TTAAAAAATG GCATCAAAAT TTAACAGATA 10380 

AGAAAAGACA ACTAATACTT GGTTTATCAA CATCTACTAA GGCTCTTGCA ATTGCAAGCA 10440 

GTTTAGAAAA AGAAGATAGG ATTGTGTTAT TGACGTCAAC TTATGGAGAA GCAGAAGGAC 10500 

TTGTTAGTGA TCTTATTTCT ATCTTGGGTG AGGAACTCGT CTATCCATTT TTGGTAGATG 10560 

ATGCTCCTAT GGTGGAGTTT TTGATGTCTT CACAGGAAAA AATTATTTCA CGGGTTGAAG 10620 

CCTTGCGTTT TTTGACTGAT TCATCTAAGA AAGGGATTTT AGTTTGTAAT ATCGCAGCAA 10680 

GTCGATTGAT TTTACCGTCT CCCAATGCAT TCAAAGATAG TATTGTAAAA ATCTCAGTTG 10740 

GTGAAGAATA TGATCAACAC GCGTTTATCC ATCAGTTAAA GGAAAATGGC TATCGAAAAG 10800 

TTACTCAAGT ACAAACTCAG GGCGAATTTA GTCTTCGAGG AGATATTTTA GATATTTTTG 10860 

AAATATCCCA GTTAGAACCT TGTCGAATTG AGTTTTTTGG TGATGAAATT GATGGTATCA 10920

GGTCATTTGA AGTAGAAACA CAATTATCGA AAGAAAATAA GAGAGAACTG ACTATCTTTG 10980 

CAGCTAGTGA TATGCTTTTG AGAGAAAAGG ATTATCAACG AGGACAGTGA GCTTTAGAAA 11040 
AACAAATTTC AAAAACTTTA TCACCTATTT TGAAATCATA CCTAGAAGAA ATTCTTTCAA 11100

GTTTTCACCA AAAACAAAGT CATGCAGACT CTCGGAAGTT TTTATCTTTG TGCTATGATA 11160

AGACATGGAC TGTCTTTGAT TATATTGAAA AAGATACTCC AATATTCTTT GATGATTATC 11220

AAAAATTGAT GAATCAGTAT GAAGTCTTTG AAAGAGACTT AGCGCAGTAC TTTACAGAAG 11280

AATTACAGAA TAGTAAAGCA TTTTCTGATA TGCAGTATTT TTCTGATATT GAACAAATCT 11340

ATAAAAAACA AAGTCCAGTG ACCTTTTTCT CTAATCTTCA AAAGGGTTTA GGAAATCTCA 11400

AATTTGACAA AATTTATCAA TTCAATCAAT ATCCTATGCA GGAATTTTTC AATCAGTTTT 11460

CTTTTCTAAA AGAAGAAATT GAACGATATA AAAAAATGGA TTACACCATT ATTCTGCAGT 11520

CTAGCAATTC AATGGGAAGT AAAACATTGG AGGATATGTT AGAGGAATAT GAGATTAAAT 11580

TGGATTCTAG AGATAAGACA AATATCTGTA AAGAATCTGT AAACTTAATA GAGGGTAATC 11640

TCAGACATGG TTTTCATTTT GTAGATGAAA AGATTTTATT GATAACTGAA CATGAGATTT 11700

TTCAAAAGAA ATTAAAGCGT CGTTTTCGAA GACAACATGT TTCAAATGCA GAGAGATTAA 11760

AAGATTACAA TGAACTTGAA AAAGGGGACT ATGTTGTCCA TCATATCCAT GGGATTGGTC 11820

AATATCTAGG AATTGAAACC ATTGAAATCA AGGGAATTCA TCGCGATTAT GTCAGTGTCC 11880

AATACCAAAA TGGTGATCAA ATTTCTATrf* PCfiTfinAAr'A GATTCATCTA CTGTCCAAAT 11940

ATATTTCAAG TGATGGTAAA GCTCCAAAAC TCAATAAATT AAATGACGGT CATTTTAAAA 12000

AGGCCAAGCA AAAGGTTAAG AACCAGGTAG AGGATATAGC TGATGATTTA ATCAAACTCT 12060

ACTCTGAACG TAGTCAGTTG AAGGGTTTTG CTTTCTCAGC TGATGATGAT GATCAAGATG 12120

CCTTTGATGA TGCTTTCCCT TATGTTGAAA CGGATGATCA ACTTCGTAGT ATTGAGGAAA 12180

TCAAGAGGGA TATGCAGGCT TCTCAGCCAA TGGATCGACT TTTAGTTGGG GATGTTGGTT 12240

TTGGAAAGAC TGAAGTTGCT ATGCGTGCAG CCTTTAAAGC AGTCAATGAT CACAAACAGG 12300

TTGTCATTCT AGTTCCGACG ACGGTTTTAG CGCAACAGCA CTATACGAAT TTTAAGGAAC 12360

GATTCCAAAA TTTTGCAGTT AATATTGATG TGTTGAGTCG CTTTAGAAGT AAAAAAGAGC 12420

AGACTGCAAC ACTTGAAAAA TTGAAAAACG GTCAAGTCGA TATTTTGATT GGAACACATC 12480

GTGTTTTGTC AAAAGATGTT GTGTTTGCTG ATTTGGGCTT GATGATTATT GATGAGGAAC 12540

AGCGATTTGG TGTCAAGCAT AAGGAAACTT TGAAAGAACT GAAGAAACAA GTGGATGTCC 12600

TAACCTTGAC CGCTACGCCA ATCCCTCGTA CCCTCCATAT GTCTATGCTG GGAATCAGAG 12660

ATTTATCTGT TATTGAAACT CCGCCGACTA ATCGCTATCC TGTTCAGACC TATGTTTTGG 12720

AAAAGAATGA TAGTGTCATT CGTGATGCTG TCTTGCGTGA AATGGAGCGT GGAGGTCAAG 12780

TTTATTATCT TTACAACAAA GTTGACACAA TTGTTCAGAA GGTTTCAGAA TTACAGGAGT 12840

TGATTCCGGA GGCTTCGATT GGATATGTTC ATGGTCGAAT GAGTGAAGTC CAGTTGGAAA 12900

ATACTCTATT AGACTTTATT GAGGGACAAT ACGATATCTT GGTGACGACT ACTATTATTG 12960 

AGACAGGGGT GGACATTCCA AATGCTAATA CTTTATTTAT TGAAAATGCG GACCATATGG 13020 

GCTTGTCAAC CTTATATCAG TTAAGAGGAA GAGTCGGTCG TAGTAATCGT ATTGCTTATG 13080 

CTTATCTCAT GTATCGTCCA GAAAAATCAA TCAGTGAAGT CTCTGAAAAG AGATTAGAAG 13140 

CGATTAAAGG ATTTACAGAA TTGGGCTCTG GCTTTAAGAT TGCAATGCGA GATCTTTCGA 13200 

TTCGTGGAGC AGGAAATCTT TTAGGAAAAT CCCAGTCTGG TTTCATTGAT TCTGTTGGTT 13260 

TTGAATTGTA TTCGCAGTTA TTAGAGGAAG CTATTGCTAA ACGAAACGGT AATGCTAACG 13320 

CTAACACAAG AACCAAAGGG AATGCTGAGT TGATTTTGCA AATTGATGGC TATCTTCCTG 13380 

ATACTTATAT TTCTGATCAA CGACATAAGA TTGAAATTTA CAAGAAAATT CGTCAAATTG 13440 

ACAACCGTGT CAATTATGAA GAGTTACAAG AGGAGTTGAT AGACCGTTTT GGAGAATACC 13500 

CAGATGTAGT AGCCTATCTG TTAGAGATTG GTTTGGTCAA ATCATACTTG GACAAGGTCT 13560 

TTGTTCAACG TGTGGAAAGA AAAGATAATA AAATTACAAT TCAATTTGAA AAAGTCACTC 13620 

AACGACTGTT TTTAGCTCAA GATTATTTTA AAGCTTTATC CGTAACGAAC TTAAAAGCAG 13680 

GCATCGCTGA GAATAAGGGA TTAATGGAGC TTGTATTTGA TGTCCAAAAT AAGAAAGATT 13740 

ATGAAATTTT AGAAGGTTTG CTGATTTTTG GAGAAAGTTT ATTAGAGATA AAAGAGTCTA 13800 

AGGAAGAAAA TTCCATTTGA TATTTTTCTT CTATAAAATA GATAAAAATG GTACAATAAT 13860 

AAATTGAGGT AATAAGGATG AGATTAGATA AATATTTAAA AGTATCGCGA ATTATCAAGC 13920 

GTCGTACAGT CGCAAAGGAA GTAGCAGATA AAGGTAGAAT CAAGGTTAAT GGAATCTTGG 13980 

CCAAAAGTTC AACGGACTTG AAAGTTAATG ACCAAGTTGA AATTCGCTTT GGCAATAAGT 14040 

TGCTGCTTGT AAAAGTACTA GAGATGAAAG ATAGTACAAA AAAAGAAGAT GCAGCAGGAA 14100 

TGTATGAAAT TATCAGTGAA ACACGGGTAG AAGAAAATGT CTAAAAATAT TGTACAATTG 14160 

AATAATTCTT TTATTCAAAA TGAATACCAA CGTCGTCGCT ACCTGATGAA AGAACGACAA 14220 

AAACGGAATC GTTTTATGGG AGGGGTATTG ATTTTGATTA TGCTATTATT TATCTTGCCA 14280 

ACTTTTAATT TAGCGCAGAG TTATCAGCAA TTACTCCAAA GACGTCAGCA ATTAGCAGAC 14340 

TTGCAAACTC AGTATCAAAC TTTGAGTGAT GAAAAGGATA AGGAGACAGC ATTTGCTACC 14400 

AAGTTGAAAG ATGAAGATTA TGCTGCTAAA TATACACGAG CGAAGTACTA TTATTCTAAG 14460 

TCGAGGGAAA AAGTTTATAC GATTCCTGAC TTGCTTCAAA GGTGATAAAA TGGAAAATTT 14520 

ATTAGACGTA ATAGAGCAAT TTTTGAGTTT GTCAGATGAA AAGCTGGAAG AATTGGCTGA 14580 
TAAAAATCAA TTATTGCGTT TACAAGAAGA AAAGGAAAGG AAGAATGCGT AAATTCTTAA 14640

TTATTTTGTT GCTACCAAGT TTTTTGACCA TTTCAAAAGT CGTTAGCACA GAAAAAGAAG 14700 

TCGTCTATAC TTCGAAAGAA ATTTATTACC TTTCACAATC TGACTTTGGT ATTTATTTTA 14760 

GAGAAAAATT AAGTTCTCCC ATGGTTTATG GAGAGGTTCC TGTTTATGCG AATGAAGATT 14820 

TAGTAGTGGA ATCTGGGAAA TTGACTCCCA AAACAAGTTT TCAAATAACC GAGTGGCGCT 14880 

TAAATAAACA AGGAATTCCA GTATTTAAGC TATCAAATCA TCAATTTATA GCTGCGGACA 14940 

AACGATTTTT ATATGATCAA TCAGAGGTAA CTCCAACAAT AAAAAAAGTA TGGTTAGAAT 15000 

CTGACTTTAA ACTGTACAAT AGTCCTTATG ATTTAAAAGA AGTGAAATCA TCCTTATCAG 15060 

CTTATTCGCA AGTATCAATC GACAAGACCA TGTTTGTAGA AGGAAGAGAA TTTCTACATA 15120 

TTGATCAGGC TGGATGGGTA GCTAAAGAAT CAACTTCTGA AGAAGATAAT CGGATGAGTA 15180 

AAGTTCAAGA AATGTTATCT GAAAAATATC AGAAAGATTC TTTCTCTATT TATGTTAAGC 15240 

AACTGACTAC TGGAAAAGAA GCTGGTATCA ATCAAGATGA AAAGATGTAT GCAGCCAGCG 15300 

TTTTGAAACT CTCTTATCTC TATTATACGC AAGAAAAAAT AAATGAGGGT CTTTATCAGT 15360 

TAGATACGAC TGTAAAATAC GTATCTGCAG TCAATGATTT TCCAGGTTCT TATAAACCAG 15420 

AGGGAAGTGG TAGTCTTCCT AAAAAAGAAG ATAATAAAGA ATATTCTTTA AAGGATTTAA 15480 

TTACGAAAGT ATCAAAAGAA TCTGATAATG TAGCTCATAA TCTATTGGGA TATTACATTT 15540 

CAAACCAATC TGATGCCACA TTCAAATCCA AGATGTCTGC CATTATGGGA GATGATTGGG 15600 

ATCCAAAAGA AAAATTGATT TCTTCTAAGA TGGCCGGGAA GTTTATGGAA GCTATTTATA 15660 

ATCAAAATGG ATTTGTGCTA GAGTCTTTGA CTAAAACAGA TTTTGATAGT CAGCGAATTG 15720 

CCAAAGGTGT TTCTGTTAAA GTAGCTCATA AAATTGGAGA TGCGGATGAA TTTAAGCATG 15780 

ATACGGGTGT TGTCTATGCA GATTCTCCAT TTATTCTTTC TATTTTCACT AAGAATTCTG 15840 

ATTATGATAC GATTTCTAAG ATAGCCAAGG ATGTTTATGA GGTTCTAAAA TGAGGGAACC 15900 

AGATTTTTTA AATCATTTTC TCAAGAAGGG ATATTTCAAA AAGCATGCTA AGGCGGTTCT 15960 

AGCTCTTTCT GGTGGATTAG ATTCCATGTT TCTATTTAAG GTATTGTCTA CTTATCAAAA 16020 

AGAGTTAGAG ATTGAATTGA TTCTAGCTCA TGTGAATCAT AAGCAGAGAA TTGAATCAGA 16080 

TTGGGAAGAA AAGGAATTAA GGAAGTTGGC TGCTGAAGCA GAGCTTCCTA TTTATATCAG 16140 

CAATTTTTCA GGAGAATTTT CAGAAGCGCG TGCACGAAAT TTTCGTTATG ATTTTTTTCA 16200 

AGAGGTCATG AAAAAGACAG GTGCGACAGC TTTAGTCACT GCCCACCATG CTGATGATCA 16260 

GGTGGAAACG ATTTTTATGC GCTTGATTCG AGGAACTCGC TTGCGCTATC TATCAGGAAT 16320 

TAAGGAGAAG CAAGTAGTCG GAGAGATAGA AATCATTCGT CCCTTCTTGC ATTTTCAGAA 16380 
AAAAGACTTT CCATCAATTT TTCACTTTGA AGATACATCA AATCAGGAGA ATCATTATTT 16440

TCGAAATCGT ATTCGAAATT CTTACTTACC AGAATTGGAA AAAGAAAATC CTCGATTTAG 16500

GGATGCAATC TTAGGCATTG GCAATGAAAT TTTAGATTAT GATTTGGCAA TAGCTGAATT 16560

ATCTAACAAT ATTAATGTGG AAGATTTACA GCAGTTATTT TCTTACTCTG AGTCTACACA 16620

AAGAGTTTTA CTTCAAACTT ATCTGAATCG TTTTCCAGAT TTGAATCTTA CAAAAGCTCA 16680

GTTTGCTGAA GTTCAGCAGA TTTTAAAATC TAAAAGCCAG TATCGTCATC CGATTAAAAA 16740

TGGCTATGAA TTGATAAAAG AGTACCAACA GTTTCAGATT TGTAAAATCA GTCCGCAGgC 16800

TGATGAAAAG GAAGATGAAC TTGTGTTACA CTATCAAAAT CAGGTAGCTT ATCAAGGATA 16860

TTTATTTTCT TTTGGACTTC CATTAGAAGG TGAATTAATT CAACAAATAC CTGTTTCACG 16920

TGAAACATCC ATACACATTC GTCATCGAAA AACAGGAGAT GTTTTGATTA AAAATGGGCA 16980

TAGAAAAAAA CTCAGACGTT TATTTATTGA TTTGAAAATC CCTATGGAAA AGAGAAACTC 17040

TGCTCTTATT ATTGAGCAAT TTGGTGAAAT TGTCTCAATT TTGGGAATTG CGACCAATAA 17100

TTTGAGTAAA AAAACGAAAA ATGATATAAT GAACACTGTA CTTTATATAG AAAAAATAGA 17160

TAGGTAAAAA ATGTTAGAAA ACGATATTAA AAAAGTCCTC GTTTCACACG ATGAAATTAC 17220

AGAAGCAGCT AAAAAACTAG GTGCTCAATT AACTAAAGAC TATGCAGGAA AAAATCCAAT 17280

CTTAGTTGGG ATTTTAAAAG GATCTATTCC TTTTATGGCT GAATTGGTCA AACATATTGA 17340

TACACATATT GAAATGGACT TCATGATGGT TTCTAGCTAC CATGGTGGAA CAGCAAGTAG 17400

TGGTGTTATC AATATTAAAC AAGATGTGAC TCAAGATATC AAAGGAAGAC ATGTTCTATT 17460

TGTAGAAGAT ATCATTGATA CAGGTCAAAC TTTGAAGAAT TTGCGAGATA TGTTTAAAGA 17520

AAGAGAAGCA GCTTCTGTTA AAATTGCAAC CTTGTTGGAT AAACCAGAAG GACGTGTTGT 17580

AGAAATTGAG GCAGACTATA CTTGCTTTAC TATCCCAAAT GAGTTTGTAG TAGGTTATGG 17640

TTTAGACTAC AAAGAAAATT ATCGTAATCT TCCTTATATT GGAGTATTGA AAGAGGAAGT 17700

GTATTCAAAT TAGAAAGAAT AATCTTTAAT GAAAAAACAA AATAATGGTT TAATTAAAAA 17760

TCCTTTTCTA TGGTTATTAT TTATCTTTTT CCTTGTGACA GGATTCCAGT ATTTCTATTC 17820

TGGGAATAAC TCAGGAGGAA GTCAGCAAAT CAACTATACT GAGTTGGTAC AAGAAATTAC 17880

CGATGGTAAT GTAAAAGAAT TAACTTACCA ACCAAATGGT AGTGTTATCG AAGTTTCTGG 17940

TGTCTATAAA AATCCTAAAA CAAGTAAAGA AGAAACAGGT ATTCAGTTTT TCACGCCATC 18000

TGTTACTAAG GTAGAGAAAT TTACCAGCAC TATTCTTCCT GCAGATACTA CCGTATCAGA 18060

ATTGCAAAAA CTTGCTACTG ACCATAAAGC AGAAGTAACT GTTAAGCATG AAAGTTCAAG 18120

TGGTATATGG ATTAATCTAC TCGTATCCAT TGTGCCATTT GGAATTCTAT TCTTCTTCCT 18180
ATTCTCTATG ATGGGAAATA TGGGAGGAGG CAATGGCCGT AATCCAATGA GTTTTGGACG 18240 
TAGTAAGGCT AAAGCAGCAA ATAAAGAAGA TATTAAAGTA AGATTTTCAG ATGTTGCTGG 18300 
AGCTGAGGAA GAAAAACAAG AACTAGTTGA AGTTGTTGAG TTCTTAAAAG ATCCAAAACG 18360 
ATTCACAAAA CTTGGAGCCC GTATTCCAGC AGGTGTTCTT TTGGAGGGAC CTCCGGGGAC 18420 
AGGTAAAACT TTGCTTGCTA AGGCAGTCGC TGGAGAAGCA GGTGTTCCAT TCTTTAGTAT 18480 
CTCAGGTTCT GACTTTGTAG AAATGTTTGT CGGAGTTGGA GCTAGTCGTG TTCGCTCTCT 18540 
TTTTGAGGAT GCCAAAAAAG CAGCACCAGC TATCATCTTT ATCGATGAAA TTGATGCTGT 18600 

TGGACGTCAA CGTGGAGTCG GTCTCGGCGG AGGTAATGAC GAACGTGAAC AAACCTTGAA 18660 

CCAACTTTTG ATTGAGATGG ATGGTTTTGA GGGAAATGAA GGGATTATCG TCATCGCTGC 18720 

GACAAACCGT TCAGATGTAC TTGACCCTGC CCTTTTGCGT CCAGGACGTT TTGATAGAAA 18780 

AGTATTGGTT GGTCGTCCTG ATGTTAAAGG TCGTGAAGCA ATCTTGAAAG TTCACGCTAA 18840 

GAATAAGCCT TTAGCAGAAG ATGTTGATTT GAAATTAGTG GCTCAACAAA CTCCAGGCTT 18900 

TGTTGGTGCT GATTTAGAGA ATGTCTTGAA TGAAGCAGCT TTAGTTGCTG CTCGTCGCAA 18960 

TAAATCGATA ATTGATGCTT CAGATATTGA TGAAGCAGAA GATAGAGTTA TTGCTGGACC 19020 

TTCTAAGAAA GATAAGACAG TTTCACAAAA AGAACGAGAA TTGGTTGCTT ACCATGAGGC 19080 

AGGACATACC ATTGTTGGTC TAGTCTTGTC GAATGCTCGC GTTGTGCATA AGGTTACAAT 19140 

TGTACCACGC GGCCGTGCAG GCGGATACAT GATTGCACTT CCTAAAGAGG ATCAAATGCT 19200 

TCTATCTAAA GAAGATATGA AAGAGCAATT GGCTGGCTTA ATGGGTGGAC GTGTAGCTGA 19260 

AGAAATTATC TTTAATGTCC AAACCACAGG AGCTTCAAAC GACTTTGAAC AAGCGACACA 19320 

AATGGCACGT GCAATGQTTA CAGAGTACGG TATGAGTGAA AAACTTGGCC CAGTACAATA 19380 

TGAAGGAAAC CATGCTATGC TTGGTGCACA GAGTCCTCAA AAATCAATTT CAGAACAAAC 19440 

AGCTTATGAA ATTGATGAAG AGGTTCGTTC ATTATTAAAT GAGGCACGAA ATAAAGCTGC 19500 

TGAAATTATT CAGTCAAATC GTGAAACTCA CAAGTTAATT GCAGAAGCAT TATTGAAATA 19560 

CGAAACATTG GATAGTACAC AAATTAAAGC TCTTTACGAA ACAGGAAAGA TGCCTGAAGC 19620 

AGTAGAAGAG GAATCTCATG CACTATCCTA TGATGAAGTA AAGTCAAAAA TGAATGACGA 19680 

AAAATAACCC TGAGAGAGGC TGCAGCCTCT CTTTTTTGTG CAGTTTAGGA GCTAAAGGGA 19740 

ACAGAATGGA GAAAATGGAA CAAATGTGTT TTCTAATCTG TTAGACTGTA TCTAGAAAGG 19800 

GGAAAATTAT GATTAAAGAA TTGTATGAAG AAGTCCAAGG GACTGTGTAT AAGTGTAGAA 19860 

ATGAATATTA CCTTCATTTA TGGGAATTGT CGGATTGGGA GCAAGAAGGC ATGCTCTGCT 19920 
TACATGAATT GATTAGTAGA GAAGAAGGAC TGGTAGACGA TATTCCACGT TTAAGGAAAT 19980

ATTTCAAGAC CAAGTTTCGA AATCGAATTT TAGACTATAT CCGTAAACAG GAAAGTCAGA 20040 

AGCGTAGATA CGATAAAGAA CCCTATGAAG AAGTGGGTGA GATCAGTCAT CGTATAAGTG 20100 

AGGGGGGTCT CTGGCTAGAT GATTATTATC TCTTTCATGA AACACTAAGA GATTATAGAA 20160 

ACAAACAAAG TAAAGAGAAA CAAGAAGAAC TAGAACGCGT CTTAAGCAAT GAACGATTTC 20220 

GAGGGGGTCA AAGAGTATTA AGAGACTTAC GCATTGTGTT TAAGGAGTTT ACTATCGGTA 20280 

CCCACTAGTA AGTCATGCAA AAAAAATGAA AAAAATTAGA AAAAGTAGTT GACAAAGTTT 20340 

GAAAAGGCTG TATAATAGTA AGAGTTGAAA ATAACAACTC AGGTCCGTTG GTCAAGGGGT 20400 

TAAGACACCG CCTTTTCACG GCGGTAACAC GGGTTCGAAT CCCGTACGGA CTATGGTATG 20460 

TTGCGTCAGG ACCACTTGAT GAAAAAAAGT TTAAAAAAAC TTAAAAATCT TCAAAAAAGT 20520 

GTTGACAAGC GAAAGCAGTT GTGATATACT AATATAGTTG TCGCTTGAGA GAAGCAAGTG 20580 

ACAAAGACCT TTGAAAACTG AACAAGACGA ACCAATGTGC AGGGCGCTAC AACGTAAGTT 20640 

GTAGTACTGA ACAATGAAAA AAACAATAAA TCTGTCAGTG ACAGAAATGA GTAAGAACTC 20700 

AAACTTTTTA ATGAGAGTTT GATCCTGGCT CAGGACGAAC GCTGGCGGCG TGCCTAATAC 20760 

ATGCAAGTAG AACGCTGAAG GAGGAGCTTG CTTCTCTGGA TGAGTTGCGA ACGGGTGAGT 20820 

AACGCGTAGG TAACCTGCCT GGTAGCGGGG GATAACTATT GGAAACGATA GCTAATACCG 20880 

CATAAGAGTA GATGTTGCAT GACATTTGCT TAAAAGGTGC ACTTGCATCA CTACCAGATG 20940 

GACCTGCGTT GTATTAGCTA GTTGGTGGGG TAACGGCTCA CGAAGGCGAC GATACATAGC 21000 

CGACCTGAGA GGGTGATCGG CCACACTGGG ACTGAGACAC GGCCCAGACT CCTACGGGAG 21060 

GCAGCAGTAG GGAATCTTCG GCAATGGACG GAAGTCTGAC CGAGCAACGC CGCGTGAGTG 21120 

AAGAAGGTTT TCGGATCGTA AAGCTCTGTT GTAAGAGAAG AACGAGTGTG AGAGTGGAAA 21180 

GTTCACACTG TGACGGTATC TTACCAGAAA GGGACGGCTA ACTACGTGCC AGCAGCCGCG 21240

GTAATACGTA GGTCCCGAGC GTTGTCCGGA TTTATTGGGC GTAAAGCGAG CGCAGGGGGT 21300 

TAGATAAGTC TGAAGTTAAA GGCTGTGGCT TAACCATA 21338 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6273 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

TGTTTTTAAA GAGCCGTGTC TGGATAGACT TTCGGACGCA ACGCTCTATT AGATAATGAA 60 

CTGCCTATAC ACAAGATTTC TAACCTTAGT CGACATGAGC TGAAACCTCT TATTTGTTAA 120 

GTAGTTCACA AAATATTATA CACCTATTTT ATGAATAGTC AACTGTCTTT ACAGTAAAAT 180 

TTTAGAAAAT CATGAAAATT TTCTCTTTCT TTCCATTTTA AGTGACATTC AGTCATTCTC 240 

ACATCAAAAA AGCCCAGACG AAATTGTCTG AGCATTCTTT TATCTAGTCG TTTAAGGAAG 300 

TTGAGTTCAG TATGTTTAAA GTCTCTGTCC CATCATTTCT TCAACAAACC TTGTTCTTGG 360 

AGAAACTCCT TGGCTACTTG CTTTGCTGAC TTGCCTTCAA CACCGACTTG GTAGTTGAGC 420 

TGGCTCATCT GGCTTTCTGT AATCTTACCA GCCAATGTAT TAAGAACTCT TTCCAACTCT 480 

GGGTGTTTCT TGAGAAGAGC TTCTTTCATG AGTGGAGCCC CTTGATAAGG TGGGAAGAGT 540 

TGCTTGTCAT CTTCCAAGAC CTGTAAATCA TAACGCTCCA ATTCCGCATC AGTCGAATAG 600 

GCATCCGTGA TTTGAATATC CCCTGACTGA ATAGCCTGAT AGCGAAGGGC TGGCTCAATG 660 

GTCGCTACAT TGAGATTGAG ACCATACATT GATTGCAAGC CCTTATTTCC ATCTTCACGG 720 

TCGTTAAACT CGAGTGTAAA ACCTGCCTTC AACTGCCCTT CCACTTTTTT CAAGTCTGAA 780 

ATGGTCTTCA AGCCATATTC TTGAGCAATC TTTTTCGGAA CAGCTACAGC ATAGGTGTTT 840 

TGATAAGACA TGGGTTTGAG ATAGGCTAGA TGATCCTGCT TAGCAATGCC ATCACGCGCC 900 

ACCTGATAAA CCTGTTCTGG TTCATGACTC ACCTTGGGTG ATGGTTGAAG CAAACTTTCA 960 

GTCACCGTAC CAGTAAATTC AGGATAGATG TCAATATCGC CTTTTTTCAG AGCTTCATAA 1020 

AGGAAGCTTG TCTTCCCAAA ATTCGGTTTA ACAGTCGCAG TCATGCTGGT ATTTTCTTCA 1080 

ATCAGCAACT TATACATATT GGCCAAAATT TCTGGTTCTG GACCTATTTT CCCAGCAATA 1140 

ACCAAGTTTT CCTTCTCTTT TTGAACCAAA AGAGCTGGAC TATAAGACAG ACCCAGTAAT 1200 

AAAGCCACCA AGGCAAAACC TGAGAAAATC GTCCGTAATT TTGCTTTTTC CATCACTTTT 1260 

AGTAGGAAGT TAAAGGCAAT GGCTAGCACT GCAGAAGAAA GTGCCCCAAT CAAAATCAAA 1320 

CTGGCATTAT TACGGTCAAT TCCCAAAAGA ATAAAGGAAC CTAGTCCCCC TGCACCAATC 1380 

AAGGCCGCCA AGGTTGCCGT ACCGATAATC AAAACAGCTG CCGTCCGAAT CCCAGACATG 1440 

ATAACAGGCA TGGCGAGTGG AATTTCAAAT TTCTTGAGAC GTTCCCATCT GGTCATCCCA 1500 

AAGGCAATCC CAGCCTCTTG CAGGTTCGGA TCAATTCCCT TCAGCCCAGT GATAGTATTT 1560 

TGCAAAATAG GGAAAATCGC ATAAATCACT AGAGCTGTCA AAGCCGGCAA GGTCCCAATT 1620 

CCCATCAAAG GGATAAAGAG CCCCAACAAG GCCAGAGACG GGATGGTCTG GAAAATACCT 1680 

GCAATCTGCA AGACCCAGTC GGCCAGCTTC TCATGATAGC GAAGAAAAAC AGCCAAGGGA 1740 
ATCGCAAGCA AAATAGCTAG TAACAAGGTC AAAAGCGACA ACTGCAAATG TTGAGATAGA 1800

GCTGTCAACC AATCACTAAA ACGATCCTGA AAAGTTGCAA TTAAATTAGT CATGAACACT 1860

ACCTCCAAAC AAGTCTGCTA CAAAGTCTGT TGCAGGCGCT TTTAAAATTG TCTCGGGATT 1920

CGCTACCTGG CGAATTTCTC CATCCTGCAA GACAGCAATA CGGTCCGCCA ACTTCAAGGC 1980

TTCATCCGTA TCATGGGTTA CAAAAATCGT TGTCATCCCA AACTCTTTAT GCAATTCTTT 2040

TGTCAGAACC TGCAACTGTT TTCTCGAAAT AGCATCCAAG GCCGAAAAGG GTTCATCCAT 2100

GAGGAAAATC TTGGGCTGAC CAATCATAGC TCGGACAATA CCGACCCGTT GCTGTTCTCC 2160

ACCAGATAAT TCACTAGGTA AGCGATGCCC ATACTCGGCT ACTGGTAAAC CAACCTTAGC 2220

CAAAAGCTCT TCTGTTTTCT TCGTAATTTC TTCCTTGCTC CACCCCTTCA TTTCAGGAAT 2280

GAGAGCAATA TTTTCCGCAA CTGTTAGATT TGGAAAAAGA GCAATAGCCT GTAAAACATA 2340

ACCAGTAGAA AGACGAAGTT CACGCTCATC ATAGTCTTTG ATGCGCTTCC CATCCATATA 2400

AATATTTCCA TCAGTTGGTT CCAAAAGACG GTTAATCATC TTGAGCATGG TCGTCTTACC 2460

TGACCCAGAA GGCCCTACTA AAACCATAAA TTCCCCATCC TCAATCTGTA AGTTGACATC 2520

TCTCAAGACA TCCTTTTCTG TGTAGCGCAG TGCTACATTT TTGTATTCAA TCATTCTTTG 2580

TCCTCAATTT AAAACTTCCC TCGATTGGTC AAGTCTTCTA CCTTAGGCAT AACTTCCTTA 2640

TTATCCCAAT GCTCCACAAT TTTCCCGTTC TCTAAACGGA AGATATCGTA CTGGGCATAA 2700

GCAACGCCAT CAATCTGAGT CTGACCATAG CTAACCACAT AGTTTCCTTG TCCTAAGAGT 2760

TGGAAAACAA AGTCAAAAGT GACACTATAT TCAGCCACAT AGTTTTTATA AGCAGCACTT 2820

CCTTGTCCAA TATCATGATT ATGCTGAATC AAATCGTCTG CCACATAATC ACTCCACTGC 2880

TCTAGCTCCC CATTTTGGAA AATTTCTGTC AAGAAAGGGC GAACCAGCTT TTTATTTTCT 2940

GCTTTCTTAT CCAAATCCTT GATTTCAAAA TCTCCAAAAA TTTGATCTAG TTGGTCATTT 3000

TCAGGTGTTC GATAGTAGTC AATGACATCC CAATGCTCAA CAATACAACC ATTCTCATCC 3060

TCACGGAAAG TATCCGTCGT CACCCATTGA GCTTCTCCAC CATTCAGATA TTGATGAACA 3120

TGAACAAAGA CCAGATTGCC ATCCTCAATG GTGCGGACAA TCTTAATCTG ACGCTCTGGA 3180

TGACGCTCAA AGAAATCTGC AAAGAAGGCT GCAAATCCTT CTTTCCCGTC AGGAACACCT 3240

GTCGAATGTT GGATATAGGT ATCCCCTACA GACTGGGCTT GAGCCTCAGC AACTCGTCCG 3300

TCTTGAATGG CATGGATGTA TAGGTTGTGA GCATTTTTCA CTTGTTGTGA CATATTCTAA 3360

ACCTCATTTC CCTTCTCTTT CAGATTCGCC AAAATTCTTT CTTGAAAACC TTCAAATTGG 3420

TGAATTTCTT CCTCTGAAAA TCCTTTGTAA AAGATAGTAT CCAATTTCTG ACTGACACGA 3480

TGCCCCACTT CTTTCTGGGA CTTGCCTAAC TCCGTTAAAA CTAAATACTT CTTACGCTTG 3540

TCTTTTCCAC ACGGACTAAC AATTACAAGC TTTTGTTCCT CTAGCTTTTT TATCATAGTC 3600 

GTCAGCGTAT TATTCGCAAG TCCAGTCGCA AGCGCGATAT CTGTCGCAGT TGCGCAGCCA 3660 

GTTTCACTAT TCCATAAAAC CGCTAAAATC TTGCCCTGTT CACCCCTATA AAGAGCCTCA 3720 

GGATCTTGAC TCAGTAACTT TTGAAAAATC CGCCCATTCA ACAAACGAAT ATGATGGGCT 3780 

AGCAAATGAC CATCTTTCAT AACACCTCCA ATTTATTTCG ATATCGAAAT GAATAAAACA 3840 

ATTGTAACAC TCATCGTTCT AACTGTCAAC TATTTCGATT TAGAAATAAT TTTTGATAAT 3900 

TATCCACACC ACCATACTCC GGCTCAACTA ACTTTTAACG AGAGTTTCTA AACTCCTTCG 3960 

TCCTCCAGTC TACAAAAGCC TTCCATTCGT ACTATCCTAT ATTTTATGAG GGGACACATT 4020 

TTTCCTATCA GACCATTTAT TTTAAAGATA GAAGTAAATC ATAATTGCTT CCATCTGTTC 4080 

TTTTATAGTA TATTGAAGTT AGACTAGAGC ACTGTATCTT CTAAAACATT GATAGAAAGC 4140 

GATTTGAATT TCCCAATCAA TTTGTTCGTA TTTATAGCAT TTCGAAACTG GAATAGGACA 4200 

CCATGACTGC TAAAAGATTT CTATAAATTC ATTTAATTTC CTCAATCAAT TTGTTCATAT 4260 

CTTATTTCAT TCCGCTATAA TTTCACCTTA CCCTATCTTT TTCGTAGCAC CCTTCAAACA 4320 

GCCTATCCCC TACCGTTTGA CGATTCCTCA CTTCGCTCCA CTTCCATTAC AGAAGTTTCT 4380 

TCACTACTAT GGGCTCGGCT GACTTCTCAT GATTCCTTGT TACTACTATT TGAACGCTCA 4440

CGAGATAGAT CTTACAAAAA ATGCTTTGAT CCACAATGGA ATCAAAGCAT TTTAAAGAGT 4500 

TCCTCATACA TAAGCGCAGA AGTCGCAGTT CCTCTGTACT TGGCTTCTTC TCTTTTGACA 4560 

AAGCGAGCCA AGTTGAGCAA CTCAGGTGCT GGATGTTTGG GATTTAGGAG CAATTCACGA 4620 

TTGACCAGGC CTGAGAGACG AACTGCCTGC AATTGCTCAT TTGTAGTAGG CAGTTTTTTA 4680

GTAGTCTCTA GGAGAGCAGC AACTAAATCT TCACTCAAAT CATGTCGAGC ATGATTGTAA 4740 

AGATCTTTTA TAAGGCTTTC TAGGTTTGGT TCTACCATCC CTACCACCTC CCTTATGGTT 4800 

TAATAATGTT TAATCAAATC AACCGTTGAA CGATCCAATT TCTTCACCAA GGCTTGTAAG 4860 

AAAGCTTGCG CTTCTAGGAA GTCATCCATT GCATAGAGGG TTTGGTGAGA ATGGATATAA 4920 

CGAGCGCAGA CACCGATAGT TGTTGATGGG ACACCACCAT TTTTCAGATG AGCTGCACCT 4980 

GCATCTGTTC CGCCTTTACC ACAGTAGTAT TGGTACTTGA TACCAGCTTC TTCAGCCGTT 5040 

GTCAAAAGGA AATCCTTCAT CCCTGGGAGA AGCAAGTGAC CTGGATCATA GAAACGAATC 5100 

AAGGTTCCAT CTCCAATCTT GCCTTGACCA CCGTAGACAT CACCTGCTGG TGAGCAATCA 5160 

ACTGCGAGGA AGACTTCTGG GTCAAACTTG GTTGTAGAGG TATGAGCGCC ACGCAGACCA 5220 

ACTTCTTCTT GGACGTTAGA ACCCAGATAG AGTTCATTGC CGAGTTTTTG ACCCGATAAA 5280 



WO 98/18931 



PCT/US97/19588 



273 



GCTTCAGCTA GCTCGCTTAC CATGAGGACA CCGTAGCGGT TATCCCAAGC TTTTGAGATG 5340

ATATTTTTTT CATTGGCTGT CAAAATTGCA GAACTATCTG GTACAATGGT ATCACCAGGA 5400

CGGATGCCAA AACTTTCTGC CTCAGCCTTG TCCGCAAAAC CACCATCAAA AACGATATCG 5460

GCAATGGCTG GCATGGTTGG TCCCCCCTTT CCACGAGTCA AATGCGGAGG AACAGAACCT 5520

GAAATCACAG GAATTTCATG ACCATCACGA GTCAAGAGTT TGAAACGTTG GCTGCTAACC 5580

ACCATGGGGT TCCAGCCACC GATTTCTACG ACACGGAAGG TACCATCTGG CTTGATTTCG 5640

CTGACCATAA AACCAACTTC GTCCATATGA GAAGCGACCA AGACGCGCGG TGCATCCACA 5700

GCTTCTGAAT GTTTGATACC AAAAATACCA CCCAAGCCAT CTGTCACCAC TTCATCCACA 5760

TGCGGTGTCA ACTTTTCACG AAGATAAGCA CGGACAGGCG CTTCATGACC TGAGACTGCA 5820

GCAAGTTCTG TTACTTCTTT AATTTTTGAA AATAATGTTG TCATTTCAGT TCCTTCTTTC 5880

TTTCATCCAT TTTACCACTT TTTATAGGAG AAGGATAGTG GGAAGGTGGA TTTCTAAGTT 5940

AGTATCTTAG TCCTGCTCTA TCTTAGAAAA GGATAGTATT CTCTTGCATG TAGTGCAAAA 6000

TCTAGTAAAC ATTCCAAAAT TAACTCGAAT ATTTATTTCC AAACAAAAAA ACAATACACC 6060

ATCAAAGTTG TTTGGATTTT TCATGAAATT TACAGAAAAT AGTTGACTTC CCTTTCTTCT 6120

TTCTTTAAAT ATATAGTTGG TTGAGTTTGG AATAGTACGC TGTAGCTGCT AAAACATTTG 6180

TAGAAATTAA TTTGACTTTC CTAATAGAGT TGTTCATATC TTATTTCAAT TTACTATAGT 6240

ACAAAACTAG AAAAGGAAAA AATCATGACC AGG 6273



(2) INFORMATION FOR SEQ ID NO: 22: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28171 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

ACAACCTTTT TCAAAAACTC ACCTTGGTAC GGAGATGTTT TGCTTTCTGC TATTATTTTC 60 

GGTTATATTC ATATCAATTT TGCTTTAACT CCTCTTGCTT TTTTCATTTA TGCTAGTGGA 120 

GGTCTTATTT TAGCTCTATT GTATCGCATG ACTAAAAATC TCTACTATCC AATACTAGTT 180 

CATATTCTCA TTAATATCAC TGCCTTCTGG GATGTGTGGT TGCTCCTATT TTCAGGAAGT 240 

TAGCTTACTA AAATAATGTC GGAACTTTCC GGCATTTTCT TTTTTCACAA ATAGTCAACG 300 

TTTTTCTTTT CGATATTGTA GTGGTGTGTA TCCAGTTATT TTTTTGAATT GATTTTGAAA 360
ATAAGGTTGA CTTGAGAAAG GCAGATAGTG AAGATAGTTA AGAAGAATAG GATGTTCTTT 420

TTTCCTTTTT GGAAAACTTC TAAAATATGG TATAATGAAA AGATAAAGAA GTTGGGGGTA 480 

GAAGATGAAC ATTCAACAAT TACGCTATGT TGTGGCTATT GCCAATAGTG GTACTTTTCG 540 

TGAAGCTGCT GAAAAGATGT ATGTTAGTCA GCCGAGTCTG TCTATTTCTG TTCGTGATTT 600 

GGAAAAAGAG TTGGGCTTTA AGATTTTCCG TCGGACCAGC TCAGGGACTT TCTTGACCCG 660 

TCGTGGGATG GAATTTTATG AAAAATCGCA AGAATTGGTT AAAGGATTTG ATATTTTTCA 720 

AAATCAGTAT GCCAATCCTG AAGAAGAAAA AGATGAATTT TCTGTTGCTA GCCAGCACTA 780 

TGACTTCTTG CCACCAACTA TTACGGCCTT TTCAGAGCGC TATCCTGACT ATAAGAACTT 840 

CCGTATTTTT GAATCAACTA CTGTTCAAAT ATTAGATGAA GTGGCGCAAG GGCATAGTGA 900 

GATTGGGATT ATCTACCTCA ACAATCAAAA TAAAAAGGGG ATTATGCAAC GGGTTGAAAA 960 

ATTAGGTCTG GAGGTCATCG AATTGATTCC TTTCCATACC CATATTTATC TCCGTGAGGG 1020 

TCATCCTTTA GCCCAGAAAG AGGAATTAGT CATGGAGGAT TTAGCGGATT TACCAACGGT 1080 

TCGTTTCACT CAAGAGAAAG ACGAGTACCT TTATTATTCA GAGAACTTTG TCGATACCAG 1140 

CGCTAGCTCA CAGATGTTTA ATGTGACAGA CCGTGCCACC TTGAATGGTA TTTTGGAGCG 1200 

GACGGACGCC TATGCGACAG GTTCTGGATT TTTAGATAGT GACAGTGTTA ATGGCATTAC 1260 

AGTTATTCGT CTCAAGGATA ACCTAGATAA CCGCATGGTC TATGTTAAAC GTGAAGAAGT 1320 

GGAGCTTAGT CAAGCTGGGA CTCTCTTCGT AGAAGTCATG CAAGAATATT TTGATCAAAA 1380 

GAGGAAATCA TGAAAAAAAG AGCAATAGTG GCAGTCATTG TACTGCTTTT GATTGGGCTG 1440 

GATCAGTTGG TCAAATCCTA TATCGTCCAG CAGATTCCAC TGGGTGAAGT GCGCTCCTGG 1500 

ATCCCCAATT TCGTTAGCTT GACCTACCTG CAAAATCGAG GTGCAGCCTT TTCTATCTTA 1560 

CAAGATCAGC AGCTGTTATT CGCTGTCATT ACTCTGGTTG TCGTGATAGG TGCCATTTGG 1620 

TATTTACATA AACACATGGA GGACTCATTC TGGATGGTCT TGGGTTTGAC TCTAATAATC 1680 

GCGGGTGGTC TTGGAAACTT TATTGACAGG GTCAGTCAGG GCTTTGTTGT GGATATGTTC 1740 

CACCTTGACT TTATCAACTT TGCAATTTTC AATGTGGCAG ATAGCTATCT GACGGTTGGA 1800 

GTGATTATTT TATTGATTGC AATGCTAAAA GAGGAAATAA ATGGAAATTA AAATTGAAAC 1860

TGGTGGTCTG CGTTTGGATA AGGCTTTGTC AGATTTGTCA GAATTATCAC GTAGTCTCGG 1920 

GAATGAACAA ATTAAATCAG GCCAGGTCTT GGTCAATGGT CAAGTCAAGA AAGCTAAATA 1980 

CACAGTCCAA GAGGGTGATG TCGTCACTTA CCATGTGCCA GAACCAGAGG TATTAGAGTA 2040 

TGTGGCTGAG GATCTTCCGC TAGAAATAGT CTACCAAGAT GAGGATGTGG CTGTCGTTAA 2100 

CAAACCTCAG GGAATGGTTG TGCACCCGAG TGCTGGTCAT ACCAGTGGAA CCCTAGTAAA 2160

TGCCCTCATG TATCATATTA AGGACTTGTC GGGTATCAAT GGGGTTCTGC GTCCAGGGAT 2220

TGTTCACCGT ATTGATAAGG ATACGTCAGG TCTTCTCATG ATTGCTAAAA ACGATGATGC 2280

GCATCTAGCA CTTGCCCAAG AACTCAAGGA TAAAAAGTCT CTCCGCAAAT ATTGGGCGAT 2340

TGTTCATGGA AATCTACCTA ATGATCGTGG TGTAATTGAA GCGCCGATTG GCCGGAGTGA 2400

AAAAGACCGT AAGAAACAGG CTGTAACTGC TAAAGGGAAG CCTGCAGTGA CGCGTTTTCA 2460

CGTCTTGGAA CGCTTTGGCG ATTATAGCTT AGTAGAGTTG GAACTGGAGA CAGGGCGCAC 2520

TCATCAAATC CGTGTCCACA TGGCTTATAT CGGCCATCCA GTCGCTGGTG ATGAGGTCTA 2580

TGGTCCTCGC AAGACTTTGA AAGGACATGG ACAATTTCTT CATGCCAAGA CTTTAGGTTT 2640

TACTCATCCG AGAACAGGTA AGACCTTGGA ATTTAAAGCA GATATCCCAG AGATTTTTAA 2700

GGAAACCTTG GAGAGATTGA GAAAGTAAGA ATGAAAAAGA AATTAACTAG TTTAGCACTT 2760

GTAGGCGCTT TTTTAGGTTT GTCATGGTAT GGGAATGTTC AGGCTCAAGA AAGTTCAGGA 2820

AATAAAATCC ACTTTATCAA TGTTCAAGAA GGTGGCAGTG ATGCGATTAT TCTTGAAAGC 2880

AATGGACATT TTGCCATGGT GGATACAGGA GAAGATTATG ATTTCCCAGA TGGAAGTGAT 2940

TCTCGCTATC CATGGAGAGA AGGAATTGAA ACGTCTTATA AGCATGTTCT AACAGACCGT 3000

GTCTTTCGTC GTTTGAAGGA ATTGGGTGTC CAAAAACTTG ATTTTATTTT GGTGACCCAT 3060

ACCCACAGTG ATCATATTGG AAATGTTGAT GAATTACTGT CTACCTATCC AGTTGACCGA 3120

GTCTATCTTA AGAAATATAG TGATAGTCGT ATTACTAATT CTGAACGTCT ATGGGATAAT 3180

CTGTATGGCT ATGATAAGGT TTTACAGACT GCTGCAGAAA AAGGTGTTTC AGTTATTCAA 3240

AATATCACAC AAGGGGATGC TCATTTTCAG TTTGGGGACA TGGATATTCA GCTCTATAAT 3300

TATGAAAATG AAACTGATTC ATCGGGTGAA TTAAAGAAAA TTTGGGATGA CAATTCCAAT 3360

TCCTTGATTA GCGTGGTGAA AGTCAATGGC AAGAAAATTT ACCTTGGGGG CGATTTAGAT 3420

AATGTTCATG GAGCAGAAGA CAAGTATGGT CCTCTCATTG GAAAAGTTGA TTTGATGAAG 3480

TTTAATCATC ACCATGATAC CAACAAATCA AATACCAAGG ATTTCATTAA AAATTTGAGT 3540

CCGAGTTTGA TTGTTCAAAC TTCGGATAGT CTACCTTGGA AAAATGGTGT TGATAGTGAG 3600

TATGTTAATT GGCTCAAAGA ACGAGGAATT GAGAGAATCA ACGCAGCCAG CAAAGACTAT 3660

GATGCAACAG TTTTTGATAT TCGAAAAGAC GGTTTTGTCA ATATTTCAAC ATCCTACAAG 3720

CCGATTCCAA GTTTTCAAGC TGGTTGGCAT AAGAGTGCAT ATGGGAACTG GTGGTATCAA 3780

GCGCCTGATT CTACAGGAGA GTATGCTGTC GGTTGGAATG AAATCGAAGG TGAATGGTAT 3840

TACTTTAACC AAACGGGTAT CTTGTTACAG AATCAATGGA AAAAATGGAA CAATCATTGG 3900

TTCTATTTGA CAGACTCTGG TGCTTCTGCT AAAAATTGGA AGAAAATCGC TGGAATCTGG 3960

TATTATTTTA ACAAAGAAAA CCAGATGGAA ATTGGTTGGA TTCAAGATAA AGAGCAGTGG 4020 

TATTATTTGG ATGTTGATGG TTCTATGAAG ACAGGATGGC TTCAATATAT GGGGCAATGG 4080 

TATTACTTTG CTCCATCAGG GGAAATGAAA ATGGGCTGGG TAAAAGATAA AGAAACCTGG 4140 

TACTATATGG ATTCTACTGG TGTCATGAAG ACAGGTGAGA TAGAAGTTGC TGGTCAACAT 4200 

TATTATCTGG AAGATTCAGG AGCTATGAAG CAAGGCTGGC ATAAAAAGGC AAATGATTGG 4260 

TATTTCTACA AGACAGACGG TTCACGAGCT GTGGGTTGGA TCAAGGACAA GGATAAATGG 4320 

TACTTCTTGA AAGAAAATGG TCAATTACTT GTGAACGGTA AGACACCAGA AGGTTATACT 4380 

GTGGATTCAA GTGGTGCCTG GTTAGTGGAT GTTTCGATCG AGAAATCTGC TACAATTAAA 4440 

ACTACAAGTC ATTCAGAAAT AAAAGAATCC AAAGAAGTAG TGAAAAAGGA TCTTGAAAAT 4500 

AAAGAAACGA GTCAACATGA AAGTGTTACA AATTTTTCAA CTAGTCAAGA TTTGACATCC 4560 

TCAACTTCAC AAAGCTCTGA AACGAGTGTA AACAAATCGG AATCAGAACA GTAGTAGAAA 4620 

AGAAGGTTTT AGGGCCTTCT TTTTCCTATC AACTCTTTTC TATTTCCTGT TATTCATGTT 4680 

ATAATGGATA AATATGAATA ATCGGAGTGA GACTATGAAA TACAAACGGA TTGTCTTTAA 4740 

GGTGGGTACT TCTTCTCTGA CAAATGAGGA TGGAAGTTTA TCACGTAGTA AGGTAAAGGA 4800 

TATTACCCAG CAGTTGGCTA TGCTGCACGA GGCTGGTCAT GAGTTGATTT TGGTGTCTTC 4860 

AGGTGCCATT GCGGCTGGTT TTGGAGCCTT AGGATTTAAA AAGCGTCCGA CTAAGATTGC 4920

TGATAAACAG GCTTCAGCAG CGGTAGGGCA GGGGCTTTTG TTGGAAGAAT ATACAACCAA 4980 

TCTTCTCTTG CGTCAAATCG TTTCTGCACA AATCTTGCTG ACCCAAGATG ACTTTGTGGA 5040 

TAAGCGTCGT TATAAAAATG CCCATCAGGC TTTGTCGGTT TTGCTCAACC GTGGGGCAAT 5100 

TCCTATCATC AATGAGAATG ATAGTGTCGT TATTGATGAG CTCAAGGTTG GGGACAATGA 5160 

CACTCTAAGT GCTCAAGTAG CGGCGATGGT CCAAGCAGAC CTTTTAGTTT TCTTGACAGA 5220 

TGTGGACGGT CTCTATACTG GAAATCCTAA TTCAGATCCA AGAGCCAAAC GCTTGGAGAG 5280 

AATCGAGACC ATCAATCGTG AGATTATTGA TATGGCTGGT GGAGCTGGTT CGTCAAACGG 5340 

AACTGGGGGT ATGTTAACCA AAATCAAGGC TGCAACTATC GCGACGGAAT CAGGAGTTGC 5400 

TGTTTATATC TGCTCATCCT TGAAATCAGA TTCCATGATT GAGGCGGCAG AGGAGACCGA 5460 

GGATGGTTCT TACTTTGTTG CTCAAGAGAA GGGGCTTCGT ACCCAGAAAC AATGGCTTGC 5520 

CTTCTATGCT CAGAGTCAAG GTTCTATTTG GGTTGATAAA GGGGCTGCGG AAGCTCTCTC 5580 

TCAATATGGA AAGAGTCTTC TCTTATCTGG TATCGTTGAA GCAGAAGGAG TCTTTTCTTA 5640 

CGGTGATATC GTGACAGTAT TTGACAAGGA AAGTGGAAAA TCACTTGGAA AAGGACGGGT 5700
GCAATTTGGA GCATCTGCTT TGGAGGATAT GTTGCGTTCT CAAAAAGCCA AGGGTGTCTT 5760

GATTTACCGT GACGACTGGA TTTCCATTAC TCCTGAAATC CAACTACTTT TTACAGAATT 5820 

TTAGAGGTAA ACTATGGTGA GTAGACAAGA ACAATTTGAA CAGGTACAGG CTGTTAAAAA 5880 

ATCGATTAAC ACAGCTAGTG AAGAAGTGAA AAACCAAGCC TTGCTAGCCA TGGCTGATCA 5940 

CTTAGTGGCT GCTACTGAGG AAATTTTAGC GGCTAATGCC CTCGATATGG CAGCGGGTAA 6000 

GGGGAAAATC TCAGATGTGA TGTTGGATCG TCTTTATTTG GATGCAGATC GTATAGAAGC 6060 

GATGGCAAGA GGAATTCGTG AAGTGGTTGC CTTACCAGAT CCAATCGGTG AAGTTTTAGA 6120 

AACAAGTCAG CTTGAAAATG GTTTGGTTAT CACAAAAAAA CGTGTAGCTA TGGGTGTCAT 6180 

CGGTATTATC TATGAAAGCC GTCCAAATGT GACGTCTGAT GCGGCTGCTT TGACTCTTAA 6240 

GAGTGGAAAT GCGGTTGTTC TTCGTAGTGG TAAGGATGCC TATCAAACAA CCCATGCCAT 6300 

TGTCACAGCC TTGAAGAAGG GCTTGGAGAC GACTACTATT CATCCAAATG TGATTCAACT 6360 

GGTGGAGGAT ACTAGCCGTG AAAGTAGTTA TGCTATGATG AAGGCCAAGG GCTATCTAGA 6420 

CCTTCTCATT CCTCGTGGAG GAGCTGGCTT GATCAATGCA GTGGTTGAGA ATGCGATTGT 6480 

ACCTGTTATC GAGACAGGGA CTGGGATTGT CCATGTCTAT GTGGATAAGG ATGCAGACGA 6540 

AGACAAGGCG CTGTCTATCA TCAACAATGC TAAAACCAGT CGTCCTTCTG TTTGTAATGC 6600 

CATGGAGGTT CTGCTGGTTC ATGAAAACAA GGCAGCAAGC TTCCTTCCTC GCTTGGAGCA 6660 

AGTGTTGGTT GCAGAGCGTA AGGAAGCTGG ACTGGAACCA ATTCAATTCC GCCTAGATAG 6720 

CAAAGCAAGC CAGTTTGTTT CAGGTCAAGC AGCTGAGACC CAAGACTTTG ACACCGAGTT 6780 

TTTAGACTAT GTCCTTGCTG TTAAGGTTGT GAGCAGTTTA GAAGAAGCGG TTGCGCACAT 6840 

TGAATCCCAC AGCACCCATC ATTCGGATGC TATTGTGACG GAAAATGCTG AAGCTGCAGC 6900 

ATACTTTACA GATCAAGTGG ACTCTGCAGC GGTGTATGTT AATGCCTCAA CTCGTTTCAC 6960 

AGATGGAGGA CAATTTGGTC TTGGTTGTGA AATGGGGATT TCTACTCAGA AATTGCACGC 7020 

GCGTGGTCCC ATGGGCTTGA AAGAGTTGAC CAGCTACAAG TATGTGGTTG CCGGTGATGG 7080 

GCAGATAAGG GAGTAAGAGA TGAAGATTGG ATTTATCGGT TTGGGGAATA TGGGTGCTAG 7140 

CTTGGCAAAA TCTGTCTTGC AGACTAGGAC GTCAGATGAG ATTCTCCTTG CCAATCGTAG 7200 

TCAAGCTAAG GTAGATGCTT TCATTGCAGA CTTTGGTGGT CAGGCTTCCA GCAATGAAGA 7260 

AATGTTTGCA GAAGCAGATG TGATTTTTCT AGGAGTTAAG CCTGCTCAGT TTTCTGAACT 7320 

GCTTTCTCAA TACCAGACCA TCCTTGAAAA AAGAGAAAGT CTTCTTTTGA TTTCGATGGC 7380 

AGCTGGATTG ACCTTAGAAA AACTAGCAAG TCTTATCCCA AGTCAACACC GAATTATTCG 7440
TATGATGCCT AATACCCCTG CTTCTATCGG GCAAGGAGTG ATTAGTTATG CCTTGTCTCC 7500
TAATTGCAGG GCTGAGGACA GTGAGCTCTT TTATCAGCTT TTAGCCAAGG CTGGTCTCTT 7560 
GGTTGAACTA GGAGAAAGTT TAATCGATGC AGCGACAGGT CTTGCAGGTT GTGGACCAGC 7620 
CTTTGTCTAT CTTTTTATCG AGGCCTTGGC AGATGCAGGT GTTCAGACAG GATTACCACG 7680 
AGAAATAGCA TTGAAAATGG CAGCACAAAC TGTGGTAGGA GCTGGGCAAT TGGTCCTTGA 7740 
AAGTCAGCAA CATCCTGGAG TATTGAAAGA CCAAGTCTGT AGCCCAGGCG GTTCGACTAT 7800 
CGCTGGTGTA GCAAGCCTAG AAGCGCATGC TTTCCGAGGA ACAGTCATGG ATGCAGTTCA 7860 
TCAAGCCTAC AAACGAACAC AAGAACTAGG TAAATAAGAG GTAGTTTTGA CTGCCTCTTT 7920 
TATGGTGGCT GAAATGAGAA GACACAAAAA GATTGTCACA AACCCCTATT TTTTTGATAG 7980 
AATAGAAGTA GTAAAAAAGA AATGAGTTAG ACATGTCAAA AGGATTTTTA GTCTCTCTTG 8040 
AGGGACCAGA GGGAGCAGGC AAGACCAGTG TTTTAGAGGC TCTGCTACCA ATTTTAGAGG 8100 
AAAAAGGAGT AGAGGTGTTG ACGACCCGTG AACCTGGCGG AGTCTTGATT GGGGAGAAGA 8160 

TTCGGGAAGT GATTTTGGAT CCAAGTCATA CTCAGATGGA TGCTAAAACA GAGCTACTTC 8220 

TCTATATTGC CAGTCGCAGA CAGCATTTGG TGGAAAAAGT TCTTCCAGCC CTTGAAGCTG 8280 

GCAAGTTGGT CATCATGGAT CGTTTTATCG ATAGTTCTGT TGCCTATCAG GGATTTGGTC 8340 

GTGGCTTAGA TATTGAAGCC ATTGACTGGC TCAATCAGTT TGCGACAGAT GGCCTCAAAC 8400 

CCGATTTGAC ACTCTATTTT GACATCGAGG TGGAAGAAGG GCTGGCTCGT ATTGCTGCTA 8460 

ATAGTGACCG CGAGGTTAAT CGTTTGGATT TGGAAGGGTT GGACTTGCAT AAAAAAGTTC 8520 

GTCAAGGCTA CCTTTCTCTT CTGGATAAAG AGGGAAATCG CATTGTCAAG ATTGATGCTA 8580 

GTCTCCCTTT GGAGCAAGTT GTGGAAACTA CCAAGGCTGT CTTGTTTGAC GGAATGGGCT 8640 

TGGCCAAATG AAACAAGATC AACTAAAGGC TTGGCAACCA GCTCAGTTTG ACCGTTTTGT 8700 

CCGTATCTTA GAACAAGACC AGCTCAATCA CGCCTATCTC TTTTCAGGTT TCTTTGAAAG 8760 

CTTGGAAATG GCGCAATTTT TAGCTAAGAG CCTCTTTTGT ACGGATAAAG TTGGCGTCTT 8820 

ACCATGTGAG AAATGCOGAA GTTGCAAGCT GATTGAACAG GGAGAATTTC CCGATGTCAC 8880 

CTTGATTAAA CCAGTTAATC AGGTCATTAA GACGGAACGC ATTCGAGAAT TGGTGGGTCA 8940 

GTTTTCTCAA GCAGGGATTG AAAGCCAGCA ACAGGTCTTT ATCATCGAGC AAGCGGATAA 9000 

AATGCATCCC AACGCAGCCA ATTCTCTGCT CAAGGTCATC GAAGAACCCC AGAGTGAAGT 9060 

TTATATTTTC TTCTTGACTA GCGATGAGGA AAAGATGTTA CCGACAATCC GAAGTCGGAC 9120 

TCAGATCTTC CACTTTAAAA AGCAAGAAGA AAAACTTATC TTACTCTTAG AACAAATGGG 9180 

ACTTGTTAAG AAAAAAGCGA CTCTTTTAGC TAAGTTTAGT CAATCGCGAG CTGAAGCAGA 9240

AAAGTTGGCT AATCAGGCAA GTTTTTGGAC CTTGGTCGAT GAAAGTGAAC GCCTGCTGAC 9300

TTGGTTAGTA GCTAAGAAAA AAGAAAGTTA TCTACAGGTT GCCAAATTAG CCAACTTGGC 9360

AGATGATAAG GAAAAACAGG ATCAGGTTTT ACGGATTCTT GAAGTTCTCT GTGGGCAGGA 9420

CCTCTTGCAG GTAAGAGTAA GAGTGATTCT ACAAGATTTA CTAGAAGCTA GAAAAATGTG 9480

GCAAGCTAAT GTCAGCTTTC AAAATGCCAT GGAATATCTG GTCTTGAAAG AAATATAAAC 9540

TCAAAAATGA ATGATAAAGA AAGGAAAGGG CTGTTTTATG GACAAAAAAG AATTATTTGA 9600

CGCGCTGGAT GATTTTTCCC AACAATTATT GGTAACCTTA GCCGATGTGG AAGCGATCAA 9660

GAAAAATCTC AAGAGCGTGG TAGAGGAAAA TACAGCTCTT CGCTTGGAAA ATAGTAAGTT 9720

GCGAGAACGC TTGGGTGAGG TGGAAGCAGA TGCTCCTGTC AAGGCCAAGC ATGTTCGTGA 9780

AAGTGTCCGT CGCATTTACC GTGATGGATT TCACGTATGT AATGATTTTT ATGGACAACG 9840

TCGAGAGCAG GACGAGGAAT GTATGTTTTG TGACGAGTTG CTATACAGGG AGTAGGCATG 9900

CAGATTCAAA AAAGTTTTAA GGGGCAGTCT CCCTATGGCA AGCTGTATCT AGTGGCAACG 9960

CCGATTGGCA ATCTAGATGA TATGACTTTT CGTGCTATCC AGACCTTGAA AGAAGTGGAC 10020

TGGATTGCTG CTGAGGATAC GCGCAATACA GGGCTTTTGC TCAAGCATTT TGACATTTCC 10080

ACCAAGCAGA TCAGTTTTCA TGAGCACAAT GCCAAGGAAA AAATTCCTGA TTTGATTGGT 10140

TTCTTGAAAG CAGGGCAAAG TATTGCTCAG GTCTCTGATG CCGGTTTGCC TAGCATTTCA 10200

GACCCTGGTC ATGATTTAGT TAAGGCAGCT ATTGAGGAAG AAATTGCAGT TGTGACAGTT 10260

CCAGGTGCCT CTGCAGGAAT TTCTGCCTTG ATTGCCAGTG GTTTAGCGCC ACAGCCACAT 10320

ATCTTTTACG GTTTTTTACC GAGAAAATCA GGTCAGCAGA AGCAATTTTT TGGCTTGAAA 10380

AAAGATTATC CTGAAACACA GATTTTTTAT GAATCACCTC ATCGTGTAGC AGACACGTTG 10440

GAAAATATGT TAGAAGTCTA CGGTGACCGC TCCGTTGTCT TGGTCAGGGA ATTGACCAAA 10500

ATCTATGAAG AATACCAACG AGGTACTATC TCTGAGTTAT TAGAAAGCAT TGCTGAAACG 10560

CCAGTCAAGG GCGAATGTCT TCTCATTGTT GAGGGTGGCA GTCAGGGTGT GGAGGAAAAG 10620

GACGAGGAAG ACTTGTTCGT AGAAATTCAA ACCCGCATCC AGCAAGGTGT GAAGAAAAAC 10680

CAAGCTATCA AGGAAGTCGC TAAGATTTAC CAGTGGAATA AAAGTCAGCT CTACGCTGCC 10740

TACCACGACT GGGAAGAAAA ACAATAAAGG GAGACAGGAT GTAATAATTC TGTCTGTTTC 10800

TGTTTAACTT AATTAGTGAT GATAATATAA AGATGTATCA CTTGGTATAG AAGCTTTGGT 10860

ATTAAGTTTT TTATTAAGCC CATACGGAAT ACCGATGGTT GGAGCAGCAG TTATAGCGTT 10920

CTTAGAAGGT ATAAATAGAA AAATAAGGTC ATTTTAAATC AAAGGATTGA TAAATCAGAA 10980




AGAAGGTGAT TTTTTGCGAA CATACGAAAA TAAAGAAGAA CTAAAAGCTG AGATAGAGAA 11040 

AACATTTGAG AAATATATTT TAGAATTTGA TAATATTCCA GAAAATTTAA AAGATAAGAG 11100 

AGCTGATGAA GTTGACAGAA CTCCAGCAGA AAACCTTGCT TATCAGGTTG GTTGGACCAA 11160 

CTTGGTTCTT AAATGGGAAG AAGATGAAAG AAAGGGGCTT CAAGTAAAAA CACCATCGGA 11220 

TAAATTTAAA TGGAATCAAC TTGGTGAATT ATATCAGTGG TTCACAGATA CCTACGCTCA 11280 

TTTATCTCTG CAAGAGTTGA AAGCAAAATT AAATGAAAAT ATTAATTCTA TCTCTGCAAT 11340 

GATTGATTCG TTGAGTGAGG AAGAATTATT TGAACCGCAT ATGAGAAAGT GGGCTGATGA 11400 

AGCGACTAAA ACAGCGACTT GGGAAGTGTA TAAGTTTATT CATGTAAATA CGGTTGCACC 11460 

TTTTGGAACT TTCAGAACTA AAATCAGAAA ATGGAAGAAG ATAGTATTAT AAATTATATT 11520 

TTTAACTTTA AAAAATTTCA TAAAAATGGT TACCAAAGGC GATAGAAGAA AAACTATCGT 11580 

CTTTTTCTTT GCAAATTTTT AAGAAGGGAG GTGATCTTGC ATGGACTTTG AATATTTTTA 11640 

TAACAGAGAA GCGGAAAGAT TTAACTTCTT AAAAGTACCG GAGATATTAG TTGATAGAGA 11700 

AGAATTTCGG GGCTTATCAG CAGAAGCAAT TATCCTTTAT TCCATACTTC TTAAACAGAC 11760 

AGGAATGTCA TTTAAGAATA ACTGGATAGA CAAGGAAGGC AGAGTATTTA TCTATTTTAC 11820 

TGTCGAAGAA ATTATGAAAA GAAGAAATAT CTCAAAGCCA ACTGCCATAA AAACATTAGA 11880 

TGAGCTTGAT GTAAAAAAGG AATAGGACTG ATCGAAAGAG TAAGGCTTGG ACTTGGTAAG 11940 

CCGAACATCA TTTATGTTAA AGACTTTATG AGTATATTTC AGGTAAAAGA AAATGACTTA 12000 

CAGAAGTCAA AAAACTTAAC TTCAGAAGTA AAAGATTTTA ACCTCAGAAG TAAAGAAAAT 12060

GAACTTCAAG AGGTTAAGAA CCTTGACTCT AACTATATAG AGAATAATAA GAGTAAGTAT 12120 

AGTAAGAGAG AATATAGTTT TGGTGAAAAC GGACTTGGAA CATTTCAAAA TGTGTTTTTA 12180 

GCTGCTGAAG ATATATCGGA TTTACAAATC ATAATGAACT CACAGCTTGA GAATTACATT 12240 

AGACTTCCTG CAAAACTAGA ATCCTAGTTC ATGATTGATA ATGCCAGCAA TCAAATTCAT 12300 

TCGTAATCCG AAGCGTTTAC GATGATTTCG ATAGATTGTT GAAAACATTT TAAACGTTTT 12360 

TACTTTGGCA AAGATGTTCT CAATCTTGCT TCTCTCCTTG GATAGCGCAT GGTTACAGGC 12420 

TTTATCTTCA GCTGTTAGCG GCTTGAGTTT GCTGGATTTA CGTGGAGTTT GTACTTGAGG 12480 

ATATATCTTC ATGAGCCCTT GATAACCACT GTCAGACAAG ATTTTACCAG CTTGTCCGAT 12540 

ATTTCTGCGA CTCATTTTGA ACAACTTCAT ATCACGACAA TAGTTCACAG CGATATCCAA 12600 

AGAAACAATT CTCCCTTGAC TTGTGACAAT CGCTTGAGCC TTCATAGCGT GAAATTTCTT 12660 

TTTACCAGAA TGATTCGCTA ATTCTTTTTT TAGGGCGATT GATTTTTACT TCCGTCGCAT 12720 

CAATCATTAC CGTGTCCTCA GAACTGAGAG GAGTTCTTGA AATCGTAACA CCACTTTGAA 12780 






CAAGAGTTAC TTCAACCCAT TGGCTCCGAC GGATTAAGTT GCTTTCGTGA ATACCAAAAT 12840

CAGCCGCAAT TTGTTCATAA GTTCGATATT CTCGCACATA TTGAAGAGTG GCCATAAGAA 12900

GGTCTTCTAG GCTTAATTTA GGTTTTCGTC CACCTTTTGC GTGTTTAAGT TGATAAGCTG 12960

TTTTTAATAC AGCTAATATC TCTTCAAAAG TCGTGCGCTG AACACCAACA AGACGCTTAA 13020

ATCGTGCATC AGTTAGTTGT TTACTTGCTT CATCATTCAT AGAACTACTA TACCATATTT 13080

TGTTTCGCAG GAAGTCTATT GGAAAGTAAG AAATATTGAA GCTGAGGGTA TTAGAAGAAA 13140

TTGTGAGCGT GGTGCTATTT TTTCAGGTAA AATAAAATAT CACGAAGATT CACAGTTTAA 13200

AGGAGATCAC TATGTTGAAT GTTATGCTGT TTTAGATAAT ACGGTTATAG CAAGAGATAG 13260

AATAACAGTC CCTATCGATC CGTTATGTGG AAAAGATTTT ATAGAGTAGC ATATAATTGA 13320

TTCTTAACTG GAATACTCAC TATCTCTTTA CATCAAGAAA ATGACTAAAC AGGGAAGTTT 13380

GCCTTCTTCC CTTTTTTTGT TATACTAGTA GAAGAAAAAA TTAGAAAGAT TTGTGGGTGT 13440

CAAACAGCCC AGTGGGGTGT TTTAATATGG ACTTAGGTCC CACCCAAAGA GGTATTAGTG 13500

TCGTGTCTCA ATCTTATATC AATGTTATCG GTGCTGGTTT GGCAGGTTCT GAAGCAGCTT 13560

ACCAAATCGC AGAGCGTGGT ATTCCAGTTA AACTATATGA AATGCGTGGT GTGAAGTCTA 13620

CACCCCAGCA TAAAACAGAC AATTTTGCTG AGTTGGTTTG TTCCAATTCT TTGCGTGGGG 13680

ATGCTTTGAC AAATGCAGTT GGTCTTCTCA AGGAAGAAAT GCGTCGCTTG GGTTCTGTTA 13740

TCTTGGAATC TGCTGAGGCT ACACGTGTTC CTGCAGGTGG TGCCCTTGCA GTGGACCGTG 13800

ATGGTTTCTC TCAAATGGTG ACCGAAAAAG TTGCCAACCA CCCCTTGATT GAAGTGGTTC 13860

GTGATGAAAT TACAGAATTG CCGACAGATG TTATTACGGT TATCGCTACT GGTCCTTTGA 13920

CAAGTGATGC CTTGGCTGAA AAGATTCATG CTCTTAATGA CGGTGCTGGT TTTTATTTCT 13980

ACGATGCGGC AGCGCCTATT ATCGATGTCA ACACTATCGA TATGAGCAAG GTCTACCTCA 14040

AATCACGTTA TGATAAGGGA GAAGCGGCCT ACCTCAATGC CCCTATGACC AAGCAAGAAT 14100

TTATGGATTT CCATGAAGCT TTGGTCAATG CAGAAGAAGC ACCGCTTAGT TCTTTTGAAA 14160

AAGAAAAGTA CTTTGAAGGA TGTATGCCTA TCGAAGTCAT GGCCAAACGT GGCATTAAAA 14220

CTATGCTTTA TGGCCCTATG AAGCCAGTCG GTCTTGAGTA CCCAGACGAC TATACAGGAG 14280

CTCGTGATGG AGAATTTAAA ACACCTTATG CGGTTGTGCA ACTTCGTCAG GATAATGCAG 14340

CTGGTAGCCT CTACAATATT GTTGGTTTCC AGACCCACCT CAAATGGGGA GAACAAAAGG 14400

GTGTCTTCCA AATGATTCCG GGTCTTGAAA ATGCGGAGTT TGTCCGTTAT GGTGTGATGC 14460

ATCGCAATTC TTACATGGAT TCACCAAATC TTCTTGAGCA GACTTACCGT TCTAAGAAAG 14520

AACCAAATCT CTTCTTTGCT GGTCAAATGA CGGGTGTGGA AGGCTATGTT GAGTCGGCGG 14580

CTTCAGGCTT AGTTGCGGGA ATTAACGCAG CTCGTCTCTT CAAGGAAGAA AGCGAGGCTA 14640

TTTTCCCCGA GACGACAGCG ATTGGAAGCT TAGCTCATTA CATTACCCAT GCCGACAGCA 14700

AACATTTCCA ACCAATGAAT GTCAATTTTG GGATCATCAA GGAGTTGGAA GGCGAGCGTA 14760

TCCGTGATAA GAAGGCTCGT TATGAAAAAA TTGCAGAGCG TGCCCTTGCC GACTTAGAGG 14820

AATTTTTGAC TGTCTAATTT TTTTGAAAGA ATTGCTCATG ATACTATAAA AATCTTAGAA 14880

ATTGTGATAA AATAGGTAGG ATGAAAGAAG GAGAGTGAAA ATGGCGAATC CCAAGTATAA 14940

ACGTATTTTA ATCAAGTTAT CAGGTGAAGC CCTTGCCGGT GAACGTGGCG TAGGGATTGA 15000

TATCCAAACA GTTCAAACAA TCGCAAAAGA GATTCAAGAA GTTCATAGCT TAGGTATCGA 15060

AATTGCCCTT GTTATCGGTG GAGGAAATCT CTGGCGTGGA GAACCTGCAG CAGAAGCAGG 15120

TATGGACCGT GTTCAGGCAG ATTACACAGG AATGCTTGGG ACTGTTATGA ATGCTCTTGT 15180

GATGGCAGAT TCATTGCAAC AAGTTGGGGT TGATACGCGT GTACAAACAG CTATTGCCAT 15240

GCAACAAGTG GCAGAGCCTT ATGTCCGTGG ACGTGCCCTT CGTCACCTTG AAAAAGGCCG 15300

TATCGTTATC TTTGGTGCTG GAATTGGTTC ACCTTACTTC TCGACAGATA CAACAGCGGC 15360

CCTTCGTGCA GCTGAAATCG AAGCAGATGC CATCCTCATG GCTAAAAATG GTGTCGATGG 15420

TGTTTACAAT GCCGATCCTA AGAAAGATAA GACAGCTGTT AAGTTTGAAG AATTGACCCA 15480

CCGTGACGTT ATCAATAAAG GTCTTCGTAT CATGGACTCA ACAGCTTCAA CCCTCTCAAT 15540

GGACAACGAC ATTGACTTGG TTGTATTCAA CATGAACCAA CCAGGCAACA TCAAACGTGT 15600

CGTATTTGGT GAAAATATCG GAACAACAGT TTCAAATAAT ATCGAAGAAA AGGAATAAGA 15660

AAGAATATGG CTAACGCAAT TATTGAAAAA GCTAAAGAGA GAATGACCCA GTCTCACCAA 15720

TCACTTGCTC GTGAATTTGG TGGTATCCGT GCTGGTCGTG CCAATGCAAG CTTGCTTGAC 15780

CGTGTACATG TAGAATACTA TGGAGTCGAA ACTCCTCTTA ACCAAATCGC TTCAATTACG 15840

ATTCCAGAAG CGCGTGTTTT GTTGGTAACA CCATTTGACA AGTCTTCATT GAAAGACATC 15900

GAACGTGCCT TGAACGCTTC TGATATTGGT ATCACACCGG CTAATGACGG TTCTGTGATT 15960

CGCTTGGTTA TCCCAGCTCT TACAGAAGAA ACTCGTCGTG ACCTTGCTAA AGAAGTGAAG 16020

AAGGTCGGCG AAAATGCTAA AGTGGCTGTC CGCAATATCC GTCGCGATGC TATGGACGAA 16080

GCTAAGAAAC GAGAAAAAGC AAAAGAAATC ACTGAAGACG AATTGAAGAC TCTTGAAAAA 16140

GACATTCAAA AAGTAACAGA CGATGCTGTT AAACACATCG ACGACATGAC TGCTAACAAA 16200

GAGAAAGAAC TTTTGGAAGT CTAAAAATAA ACAGAAAAAC TCAGTTGGCA TTGCTGGCTG 16260

AGTTTTATTC GAAAGAAGGA AATATGAATA CAAATCTTGC AAGTTTTATC GTTGGACTGA 16320

TCATCGATGA AAACGACCGT TTTTACTTTG TGCAAAAGGA TGGTCAAACC TATGCTCTTG 16380

CTAAGGAAGA AGGCCAACAT ACAGTAGGGG ATACGGTCAA AGGTTTTGCA TACACGGATA 16440

TGAAGCAAAA ACTCCGCCTG ACAACCTTAG AAGTGACTGC CACTCAGGAC CAATTTGGTT 16500

GGGGACGTGT CACAGAGGTT CGTAAGGACT TGGGTGTCTT TGTGGATACA GGCCTTCCTG 16560

ACAAGGAAAT CGTTGTGTCA CTCGATATTC TCCCTGAGCT CAAGGAACTC TGGCCTAAGA 16620

AGGGCGACCA ACTCTACATC GGTCTTGAAG TGGATAAGAA AGACCGTATC TGGGGCCTCT 16680

TGGCTTATCA AGAAGACTTC CAACGTCTTG CTCGTCCTGC CTACAACAAC ATGCAGAACC 16740

AAAACTGGCC AGCCATTGTT TACCGTCTCA AGCTGTCAGG AACTTTTGTT TACCTACCAG 16800

AAAATAATAT GCTTGGTTTT ATTCATCCTA GCGAGCGTTA CGCAGAGCCA CGTTTGGGGC 16860

AAGTATTAGA TGCGCGCGTT ATTGGTTTCC GTGAAGTGGA CCGCACTCTG AACCTCTCCC 16920

TCAAACCACG CTCCTTTGAA ATGTTGGAAA ACGATGCTCA GATGATTTTG ACTTATTTGG 16980

AAAGCAATGG CGGTTTCATG ACCTTAAATG ACAAGTCATC TCCAGACGAC ATCAAGGCAA 17040

CCTTTGGCAT TTCTAAAGGT CAGTTCAAGA AAGCTTTAGG TGGTCTTATG AAGGCTGGTA 17100

AAATCAAGCA GGACCAGTTT GGGACAGAGT TGATTTAGGG AGGCTTATGA GAAAATCATT 17160

TTACACTTGG CTCATGACCG AGCGCAATCC TAAAAGTAAC AGTCCCAAAG CAATTTTGGC 17220

AGACCTCGCT TTTGAAGAGT CAGCCTTTCC AAAACACACA GATGATTTTG ATGAGGTCAG 17280

TCGCTTTTTG GAGGAGCATG CCAGTTTCTC TTTTAACCTA GGAGATTTTG ACAGCATTTG 17340

GCAGGAATAT CTAGAACACT AGCATTTATT CATTGGGTTT GGGCTAGTAA TTTCTCCATC 17400

CCTCTGCTAT AATAAAAAGA AATAAAAGGA TTAGAGAGGT TCTTTATTTG AAGGAACATT 17460

CAATAGACAT TCAACTGAGT CATCCAGATG ACCTGTTTCA TCTTTTTGGT TCCAATGAAC 17520

GCCATCTTCG TTTGATGGAA GAAGAGCTTG ATGTTGTGAT TCATGCTCGT ACGGAGATTG 17580

TCCAGGTTTT GGGAGAAGAG TCTGCCTGTG AGGAAGCCCG TCAAGTTATT CAGGCTTTGA 17640

TGGTCTTGGT AAATCGTGGG ATGACCGTTG GTACGCCAGA TGTAGTCACT GCGATTAGCA 17700

TGGTCAAAAA TGATGAAATT GACAAGTTTG TCGCCCTTTA CGAAGAAGAA ATTATCAAGG 17760

ATAATACTGG GAAACCTATC CGTGTCAAAA CCCTAGGGCA AAAGCTTTAT GTGGACAGTG 17820

TCAAACAGCA TGATGTGACC TTTGGAATTG GGCCAGCAGG TACAGGGAAG ACCTTCCTTG 17880

CAGTGACCTT GGCAGTGACT GCCCTTAAAC GTGGGCAAGT CAAGCGAATT ATCCTAACTC 17940

GTCCAGCGGT GGAAGCGGGA GAGAGTCTTG GATTTCTTCC GGGTGATGTT AAGGAGAAGG 18000

TGGATCCTTA CCTTCGTCCT GTTTACGATG CCTTGTATCA AATTCTTGGG AAAGACCAAA 18060

CGACTCGTCT CATGGAGCGT GAAATTATCG AAATTGCGCC CCTTGCCTAT ATGCGTGGCC 18120

GGACCTTGGA TGATGCCTTT GTCATTCTCG ATGAGGCGCA AAACACGACC ATCATGCAGA 18180

TGAAGATGTT CTTGACGCGT TTAGGTTTTC ATTCTAAGAT GATTGTCAAT GGAGATATTA 18240

GTCAGATTGA CCTGCCACGT AATGTCAAGT CCGGTTTGAT TGATGCTCAA GAGAAACTCA 18300

AGAACATCCA TCAGATTGAC TTTGTTCATT TTTCAGCCAA GGATGTGGTT CGCCATCCTG 18360

TTGTCGCTCA GATTATCCGA GCCTATGAAT ATTCTACTGA AGTTGCACAC GACTGATTTT 18420

GAGGAAGTTC GCCTGCAAAA GAATAGACTT GTTCGGTAAC TGTAAAAAGT GTTATACTAT 18480

TTTTATGGAA ACAGTATACG ACAAAGCACA AAAACTTAAC TCAAAAAACT TCAAACTATT 18540

GATTGGTGTC AAAAAGGAAA CCTTTCAACT CATGCTAGAA CACCTGAATT CAGCCTATCA 18600

GATTCAGCAC CGAAAAGGTG GACGTCCACG TAGTCTGCCC ATGGAAGACC AGCTCATTAT 18660

GACCCTCCGT TACTTGCGAT ATTATCCCAC TCAGCGTCTG CTGGCCTTTG ATTTTGGCGT 18720

CGGTGTAGCT ACGGTAAATG CCATCATCAC TTGGGTGGAG GATACACTTC GTGCGTCAGG 18780

TAGCTTTGAT TTGGACCATT TAGAAGCCCC GAGTGCTGCT GTGGCTATTG ACGTGACCGA 18840

AAGTCCGATT CAGCGTCCAA ACAAAACCAA AGCAAAAATT ATTCTGGTAA AAAGAAACGA 18900

CACACCTTAA AAACTCAAAT TATGCTGGAT TTGACGACAC ATAAAGTCTG TCAAATGGCC 18960

TTTTCTGACG GACATACGCA TGATTTTACT CTCTTCAAAG AAAGTATTGG ACAAAGTTTG 19020

CCTGAAACGA CGCTTGCCTT TGTTGACCTA GGTTATTTAG GCATCTTGAA ATTTCATGAG 19080

AATACTTTCA TTCCTGCTAA AAATTCCAAA AATCGCCGCC TGAGTGAGGA TGATAAGCAG 19140

TTAAATAAAG AGATGTCAGC GATACGAATT GAAATTGAAC ATTTTAACGC TAAATTCAAG 19200

ACCTTCCAAA TCATGTCAGT CCCTTATCGT AACCGCAGAA AACGTTTCGA GTTACGGGCG 19260

GAATTAATTT GTGCCATCAT CAATTATGAA GTGAACTAGA TTCCGAACAA GTCTAATATA 19320

CTTTTGAGAG AGGAAAATCC AGTTGTATAG GCTAAAGGTT TTATCCAAAG GTCTGAGACA 19380

ACGATTAGGC ACGATGGAAA GAACTTTTAT GTGGCTGATG ACGATCAGTG CATCTTCCTG 19440

TGTCATAATC ACAGGGCACA AGAAAGTAGG AATTTGAAAA GATGATTGAC CAACTATCTA 19500

AGTATTACAG TTGTAGGATA CTAACTGAAA AGGATATTCC AAGTATTTTA TCTTTATATG 19560

AAAGTAATCC TCTGTATTTT CAGCATTGTC CACCAGAGCC AAATTTTGCA ACTGTAAAAG 19620

AGGACATGCT TTGTCTACCT GAAGGTAAAG CTAAGGCTGA TAAGTTTTTT GTTGGATTTT 19680

GGAATGGATC TGACCTTGTG GCTGTTATGG ATTTTGTCTA TGCATATCCT GATGAGGAGA 19740

CTGTTTTTAT TGGTTTGTTT ATGGTTGATC AAGCCTATCA GAGAAAAGGG ATTGGTAGTC 19800

ATATTGTGAC AGAAGCACTA GCTTATTTTG CTAAGAACTT TCGAAAGGCA CGTTTGGCTT 19860

ATGTTAAGGG AAATCCGCAA TCTCAGCATT TTTGGGAAAA GCAGGGCTTT AAATCAATTG 19920

GATGCGAGGT TAAGCAAGAA CTCTATACGG TTGTTATCGC TGAACAGAGC CTAGAAGATT 19980

AGAAATGGCA TCAAGTAAGA ACTATTTGGA ATTTGTTTTG GAACAATTAT CAGGATTAGA 20040

TGATGTGACT TACCGTTCCA TGATGGGGGA GTATATTCTT TACTTCCGCG GCAAGATTAT 20100

TGGCGGCATT TATGACGATC GCTTTTTAGT TAAACCCGTG CAAGCAGTCT TAGATAAGAT 20160

TGACCAATCT TCTTTTGAGT TTCCATACAA AGGTGCCAAA GAAATGATTT GAGTGGAAGA 20220

ACTTGATAAT AAGATGTTTC TATAAGACCT AATTTTAGCT ATGTATAACC AACTGCCAAC 20280

GCCCAAACCT AAAAAGAAAA AGCAAGGGTG AACGAAGTAA AAAAGAAGTC TGCTAAGGCC 20340

CTGTCTTTGC ACGGGTAAAA TTTTATATAT AAAAAGAAGC TGGGACTAAA GAGCTCAGCT 20400

TCCTTTGGTT TATATAATTG TCATTACAAG ACGAAGTGGT TGGGCGAAAC TCTGTTGACT 20460

TTATTCAATT TAGAGTTTCT TATGCACAAT TGAGTCTGGA ACGAAAGTCT CCAGTTGCAA 20520

AGTATACAGT ACAATAAACC AACGATGTAA TAGCTGATGA CACAAAGCAC AGTGGGTAGG 20580

ACTTGCGAAG TCACCCTTTT CTTTTCAAAA TTTATACTAA ATCATTGATA TCAGTGTAGT 20640

CACGATTAAG TCCTTGAGCA ACTGGTAGGT TAGTCAAGTA ACCTTGATAA GTAGTCACAC 20700

CTTGACGCAA GCCTTCATCT TCAGAGATTG CTTGTGCGAA TCCTTTGCCA GCCAAAGCTT 20760

CGATATAAGG AAGAGTGACA TTGGTTAGGG CGATGGTTGA AGTGCGAGCA ACCGCACCAG 20820

GGATATTGGC AACGGCATAG TGGAGAACAC CGTGTTTTTC ATAGACGGGT TCATCGTGCG 20880

TTGTCACACG GTCAGCTGTT TCGATAACGC CACCTTGGTC AACAGCAACG TCAACGATAC 20940

AGAGCCTGGA CGCATTTGTT TGACCATCTC ATCTGTCACC AATTCCGGTG CTTTTGCACC 21000

AGGGATGAGA ATGGCTCCAA TCACCACATC AGCATCTCTC ACACTTGCTT CAATGTTGAA 21060

TGAATTAGAC ATAAGAGTTT GAATTTGACT TCCAAAGACT TCTTCTAGAA CTGAGAGACG 21120

CTTGGAACTA ATATCTAAAA TAGTCACTTG AGCACCAAGA CCAAGGGCGA TGCGGGCAGC 21180

ATGTGTACCG ACGACACCAC CACCGATGAT AGTTACTTTT CCTTTTGGAA CACCTGGTAC 21240

ACCACCAAGT AGAACACCAG AGCCACCAGC TTGCTTAGTA AGGAAGTGAG CTCCGATTTG 21300

AACAGCCATA CGACCTGCAA CCTCACTCAT AGGAACGAGG AGCGGTAGTT GTCCTTGATT 21360

GTCACGAACA GTTTCAGTTG TTTTTGCTGT TAACATAGCA TCTGCTAATT CTGGAGCAGC 21420

GGCCATGTGC AAGTAGGTGA AGAGAAGAAG ATCGTCGCGC AAGTAACCGT ATTCAGAACT 21480

TAAAGATTCT TTTACTTTCA CAACCAACTC TGCTGCCCAA GCTTCACCAG CAGTAGCGAC 21540

AATCTCAGCT CCTTGCTTTT GATAGTCAGC ATCAGTAAAG CCAGAACCGA GACCAGCATT 21600

TGTTTCGATA AGGACACGAT GACCACGACT AACTAAGCTA TGAACACCTG CAGGTGTGAG 21660

GGCGACACGG TTTTCGTTAT TTTTAATTTC TTTTGGGATT CCGATTAACA TTGAGATAAC 21720 

CTACCTTTCA ATTGACGGTC TTGTTTTGGT TGTCACATTC CAGTTCATAA ATCAAAAATG 21780 

TGACGGTTTC ATTGTATATG AAACCGCTTC AAAAATCAAG AAAAACTTGT CATCCAAATT 21840 

TTTTTATGCT AGACTAGTGA AAATCAAGCT CTAATGGAGG GAAAAGTATG GAATCAATAT 21900 
TTGTGAAATT TGCCCAGTAT CCGTCTATAG AAACGGAGCG TTTATTGCTC AGACCTGTAA 21960

CTTTGGATGA TGCGGAAcAA TGTTTGACTA TGCCTCGGAC AAGGGTAATA CACGTTACAC 22020 

TTTTCCAACC AATCAAAGCT TGGAAGAAAC CAAGAATAAC ATTGCTCAGT TCTACTTGGC 22080 

TAATCCCTTG GGACGTTGGG GAATAGAACT AAAAAGCAAT GGTCAGTTTA TTGGAACCAT 22140 

TGACTTGCAC AAGATTGATT CTGTTCTTAA GAAGGCAGCT ATTGGCTACA TTATCAATAA 22200 

AAAGTATTGG AATCAAGGAT TAACGACAGA AGCCAATCGT GCTGTGATTG AGCTAGCTTT 22260 

TGAGAAGATA GGGATGAATA AGTTGACTGC CCTTCACGAT AAGGCTAATC CCGCGTCAGG 22320 

AAAGGTCATG GAGAAATCAG GCATGCGTTT TTCCCATGCA GAACCATATG CTTGTATGGA 22380 

CCAGCATGAA AAAGGCCGAA TCGTGACAAG AGTTCATTAT GTCTTGACCA AGGAAGACTA 22440 

TTTTGCAAAT AAATAAGCAG TTGAAAAGAA ATTTTTCGAC TGTTTTTTCT TCCTCTTACG 22500 

AATAATCTAA GAGAGGAGAA AATATGGAAG CAATTATCGA GAAAATCAAA GAGTATAAAA 22560 

TCATCGTCAT CTGTACTGGT CTGGGCTTGC TTGTAGGAGG ATTTTTCCTG CTAAAACCAG 22620 

CTCCACAAAC ACCTGTCAAA GAGACGAATT TGCAGGCTGA AGTTGCAGCT GTTTCCAAGG 22680 

ACTCATCGAC CGAAAAGGAA GTGAAGAAGG AAGAAAAGGA AGAACCCCTT GAACAAGATC 22740 

TAATCACAGT AGATGTCAAA GGTGCTGTCA AATCGCCAGG GATTTATGAC TTGCCTGTAG 22800 

GTAGTCGAGT CAATGATGCT GTTCAGAAGG CTGGTGGCTT GACAGAGCAA GCAGACAGCA 22860 

AGTCGCTCAA TCTAGCTCAG AAAGTTAGTG ATGAGGCTCT GGTTTACGTT CCTACTAAGG 22920 

GAGAAGAAGC AGTTAGTCAA CAGACTGGTT CGGGGACAGC TTCTTCAACA AGCAAGGAAA 22980 

AGAAGGTCAA TCTCAACAAG GCCAGTCTGG AAGAACTCAA GCAGGTCAAG GGACTGGGAG 23040 

GAAAACGAGC TCAGGACATT ATTGACCATC GTGAGGCAAA TGGCAAGTTC AAGTCAGTAG 23100 

ACGAGCTCAA GAAGGTCTCT GGCATTGGTG GCAAAACAAT AGAAAAGCTT AAAGACTATG 23160 

TTACAGTGGA TTAAGAATTT CTCTATTCCC CTAATTTACC TGAGTTTTCT ATTACTTTGG 23220 

CTTTATTACG CTATTTTCTC AGCATCTTAT CTTGCTTTGT TGGGCTTTGT TTTTCTGCTA 23280 

GTCTGTCTCT TTATCCAATT TCCGTGGAAA TCTGCTGGTA AAGTTCTAAT AATTTGCGGA 23340 

ATCTTTGGAT TTTGGTTTGT TTTTCAAAAT TGGCAACAGA GTCAAGCGAG TCAAAATCTG 23400 
GCGGATTCTG TTGAAAGGGT ACGGATTTTG CCTGATACTA TTAAGGTTAA TGGTGATAGT 23460

CTATCCTTTC GTGGCAAGTC TAACGGTCGT GCTTTCCAAG TCTATTATAA ACTCCAGTCC 23520

GAGGAGGAGA AAGAAGCCTT TCAAGCTTTA ACTGACCTGC ATGAGATAGG ACTAGAAGGG 23580

AAGCTTTCGG AGCCAGAAGG GCAGAGAAAT TTTGGTGGCT TTAATTAGCA AGGCTATCTG 23640

AAGACTCAGG GAATTTACCA GACTCTCAAT ATCAAAACAA TCCAGTCAGT TCAAAAGATT 23700

GGCAGTTGGG ATATAGGAGA AAACTTGTCC AGTTTACGTC GAAAGGCTGT GGTTTGGATT 23760

AAGACGCACT TTCCAGACCC TATGGGCAAT TACATGACAG GACTCTTGCT GGGACATCTG 23820

GAGACCGACT TTGAGGAGAT GAATGAGCTT TATTCCAGTC TAGGAATTAT GCACCTCTTT 23880

GCCCTATCTG GCATGCAGGT AGGTTTTTTC ATGAATGGAT TTAAGAAACT TCTCTTGCGA 23940

TTGGGCTTGA CCCAAGAAAA GTTGAAATGG CTGACTTATC CCTTTTCCCT TATCTATGCG 24000

GGACTAACTG GATTTTCAGC ATCGGTTATT CGCAGTCTCT TGCAAAAGCT ACTGGCTCAA 24060

CATGGGGTTA AGGGCTTGGA TAATTTTGCC TTGACGGTGG TTGTCCTGTT TATTGTCATG 24120

CCAAACTTTT TCTTGACAGC AGGAGGAGTC TTGTCCTGCG CTTATGCTTT TATCCTGACC 24180

ATGAGCAGCA AAGAAGGGGA GGGGCTCAAG GGTGTTACTA GTGAAAGTCT AGTCATCTCC 24240

TTGGGCATAT TGCCCATTCT ATCCTTCTAT TTTGCGGAAT TTGAACCTTG GTCTATCCTT 24300

TTGACCTTTG TCTTTTCCTT TCTTTTTGAC TTGGTCTTCT TACCGCTCTT GTCTATCTTA 24360

TTTGTCCTTT CCTTTCTCTA TCCAGTCATT CAGCTGAACT TTATCTTTGA ATGGTTAGAG 24420

GGCATTATTC GCTTGGTCTC GCAGGTGGCA AGGAGACGAC TTGTCTTTGG TCAACCCAAC 24480

GCATGGCTTT TAATCTTATT GTTAATTTCC TTGGCTTTGG TCTATGATTT GAGGAAAAAC 24540

ATTAAAGGAT TAACAGTATT GAGTTTATTG ATTAGAGGTC TCTTTTTCCT TACCAAGTAT 24600

CCACTGGAAA ATGAAATCAC CATGCTGGAT GTGGGGCAAG GAGAAAGTAT TTTCTACGGG 24660

ATGTAACTGG GAAAACGATT CTCATAGATG TAGGTGGTAA GGCAGAATCT TATAAGAAAA 24720

TCAAAAAATG GCAAGAAAAG ATGACGACCA GCAATGCCCA GGGAACCTTG ATTGCGTATG 24780

TCAAAAGTCG AGGAGTAGCT AAGATTGACC AGCTAATTTT GACTAACACG GACAAGGAGC 24840

ATGTTGGAGA TTTGTCAGAG ATGACCAAGG CTTTCCATGT AGGGGAGATT CTAGTATGAA 24900

AAGACAGTCT GAAACAGAAG GAATTTGTGG CAGAACTACA GGCGACTCAA ACAAAGGTGC 24960

GTAGTATGAT AGTAGGGGAG AACTTGCCCA TTTTTGGAAG TCAGTTAGAA GTTCTATCTG 25020

CAAGGAAAAT GGGAGATGGA GGACACGATG ATACGCTAGT TCTGTATGGG AAATTCTTGG 25080

ATAAGGAATT TCTGTTCAGG GGAAATTTGG AGGAGAAAGG AGAGAAGGAG TTGCTGAAGC 25140

ACTATCCAGA CTTGAAAGTA AATGTTTTGA AAGCTAGCCA ACATGGCAAT AAAAAATCAT 25200

CAAGTCCAGC CTTTCTAGAA AAACTCAAAC CAGAGCTTAC TCTTATCTCA GTTGGAAAGA 25260 

GCAATCGAAT GAAACTCCCC CATCAGGAAA CATTGACACG ACTGGAAGGT ATCAATAGCA 25320 

AAGTTTATCG AACTGACCAG CAAGGAGCTA TACGTTTTAA GGGGTTGGAT AGTTGGAAAA 25380 

TCGAAAGTGT TCGATAGGAA GGATAAATGT TGTAGATTAG TGAAATAAAC TAAAAATTTG 25440 

TTGCATAATA ATGATAAAAA TGGTATAATG AAAACGTATT CAATATTGAG GATATAAAAT 25500 

CATTAAAAAT CAGCAAAAGT TGTTTTATTA GTTAGTTTAT AATCTATTGG TCTTCTTCAG 25560 

TCCAGTGTAT CTGCTGTGAC AGTCACTAAA AGTTACAAGT ATGATTGGAA TACGGTTTGG 25620 

GAATATAGTA CCAACTATCA CGACCATCAG TATGCTTGGA TTCCGTCATG GTCTCGTTAT 25680 

GACAGCTATT CTGAGTATAA AGTTGGCGGA GGCTGGAACT ACGCTCGTTA TGAGGTCATA 25740 

AACTATTACA GCGGAGGCTA TTAATTCTTA AAGAGTGAGA AAAAGGAGGG CTAGATATGT 25800 

TGCAGCTTAC TCATGTGACC TTAAAAACGC GACAAGTCAT CTTGCAAGAT GTGGATTTCA 25860

CCTTTAAAAA GGGTAGGGTT TATGGTCTTC TTGCTATCAA TGGCTCTGGA AAGACGACCC 25920 

TGTTCCGTGC CATTAGCAAT TTAATTCCCA TAAGTAGTGG AAATATCGCA GCCCCTCCTT 25980 

CTTTATTTTA TTATGAGAGT ATTGAATGGC TGGATGGAAA CTTAAGTGGG ATGGACTACC 26040 

TTCGTCTTAT CAAAAACATC TGGAAGTCAG GTCTGAACTT GAGGGATGAA ATCGCCTATT 26100 

GGGAAATGTC TGACTATATC AGTCTTCCCA TTCGCAAGTA TTCCTTAGGC ATGAAGCAAC 26160 

GCTTGGTGAT TGCCATGTAT TTCCTCAGTC AGGCCAAATG CTGGCTCATG GATGAGATTA 26220 

CAAATGGCTT AGATGAGTAT TATCGACAGA AGTTTTTTGA TAGGCTAGCA CAAATCGATA 26280 

GACAAGAACA GCTGGTTCTT TTAAGTTCCC ACTATAAGGA AGAGTTGGTT GATGTCTGCG 26340 

ATAGAGTAGT AACCATTCAT CAGGGGCAGA TAGAAGAGGT TTAGTTTATG AAAGATGTTA 26400 

GTCTATTTTT ATTGAAAAAA GTTTTCAAAA GCCGCTTAAA CTGGATTGTC TTAGCTTTAT 26460 

TTGTATCTGT ACTCGGTGTT ACCTTTTATT TAAATAGTCA GACTGCAAAC TCACACAGCT 26520 

TGGAGAGCAG GTTGGAAAGT CGCATTGCAG CCAACGAGAG GGCTATCAAT GAAAATGAAG 26580 

AGAAACTCTC CCAAATGTCT GATACCAGCT CGGAGGAATA CCAGTTTGCT AAAAATAATT 26640 

TAGACGTGCA AAAAAATCTT TTGACGCGAA AGACAGAAAT TCTGACTTTA TTAAAAGAAG 26700 

GGCGCTGGAA AGAAGCCTAC TATTTGCAGT GGCAAGATGA AGAGAAGAAT TATGAATTTG 26760 

TATCAAATGA CCCGACTGCT AGCCCTGGCT TAAAAATGGG GGTTGACCGC GAACGGAAGA 26820 

TTTACCAAGC CCTGTATCCC TTGAACATAA AAGCACATAC TTTGGAGTTT CCGACCCACG 26880 

GGATTGATCA GATTGTCTGG ATTTTAGAGG TTATCATCCC AAGTTTGTTT GTGGTTGCTA 26940 
TTATTTTTAT GCTAACACAA CTATTTGCAG AAAGATATCA AAATCATCTG GACACAGCTC 27000

ACTTATATCC TGTTTCAAAA GTGACATTTG CAATATCCTC TCTTGGAGTT GGAGTGGGAT 27060 

ATGTAACTGT GCTGTTTATC GGAATCTGTG GCTTTTCTTT TCTAGTGGGA AGTCTGATAA 27120 

GTGGTTTTGG ACAGTTAGAT TATCCCTACC CAATTTATAG CTTAGTGAAT CAAGAAGTAA 27180 

CTATTGGGAA AATACAAGAT GTATTATTTC CTGGCTTGCT CTTAGCTTTC TTAGCCTTTA 27240 

TCGTCATTGT GGAAGTTGTG TACTTGATTG CTTACTTTTT CAAGCAAAAA ATGCCTGTCC 27300 

TCTTTCTTTC ACTCATTGGG ATTGTTGGCT TATTGTTTGG TATCCAAACC ATTCAGCCTC 27360 

TTCAAAGGAT TGCACATCTG ATTCCCTTTA CTTACTTGCG TTCAGTGGAG ATTTTATCTG 27420 

GAAGATTACC TAAGCAGATT GATAATGTCG ATCTAAATTG GAGCATGGGA ATGGTCTTAC 27480 

TTCCTTGCCT GATTATCTTT TTGCTATTGG GAATTCTATT TATTGAAAGA TGGGGAAGTT 27540 

CACAGAAAAA AGAATTTTTT AATAGATTCT AGCTTTCCTA TAGGTAGGGA AAATAAGTAA 27600 

AAACTAACAT AGAGAGGGAA TCAACTTGAT TCTCTCTTTT TGATTCGAAA ACCAAACCAA 27660 

AATACAAACA CAAACTTTTC AAAAAATAAC TTTTTATCTT GACAAGAGCT AGAAAACTTG 27720 

GTATCATATA AAAGTTGAGA AAAGCAGAAG TGAGAGCTTC TCGCCTTGTG ACATTAAGTT 27780 

GCCTGGCCCT ACGGATGAAA AGTTTCGAAG AAACGCTATC ATAACGTGCG GGCTTGTATA 27840 

TTTACAAGTC CGCTATTGTT TTTCTCTAAT AAAACAAAAG AGGTGAAAAC CATAGCAAAG 27900 

CAAGACTTAT TCATCAATGA TGAGATTCGT GTACGTGAAG TTCGCTTGAT TGGTCTTGAA 27960 

GGAGAACAGC TAGGTATCAA GCCACTCAGT GAAGCGCAAG CTTTGGCTGA TAACGCTAAT 28020 

GTTGACCTAG TATTGATTCA ACCCCAAGCC AAACCGCCTG TTGCAAAAAT TATGGACTAC 28080 

GGTAAGTTCA AATTTGAGTA CCAGAAGAAG CAAAAAGAAC AACGTAAAAA ACAAAGCGTT 28140 

GTTACTGTGA AAGAAGTTCG TCTAAGTCCG G 28171 
(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7147 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
CCGCTCAACT TTTGCAATCA AGGCTAAGTA GACAGCAGCA AATTTCATAT TGTATAATTT 60

CTGACTCATA CTTCTCTCTT TCTATGTGTA CTAGTATAAA TAAGAAAAAG AAGGCCGTCA 120

AGCCTTCTTT TGATTTATTC TTCTGCTTCA TCTTCTGTAA ATTGACTATT GTACAAGTCA 180

GCGTAGAAGC CACCTTGCGC CATCAGTTCC TCATAGTTGC CTTGCTCGAT GATATTTCCA 240 

TCTTTCATGA CCAAGATCAA GTCTGCATTT CGGATGGTTG ACAAGCGGTG GGCAATGACA 300 

AAGGATGTGC GTCCTTCCAT CAAACGGTCC ATGGCTTTTT GGATCAATTC CTCTGTCCGT 360 

GTGTCAACAG AAGAAGTCGC CTCATCCAAA ATCAAAAGCG GTGCATCCTT AAGAAGGGCA 420 

CGAGCAATAG TCAATAGTTG TTTTTGTCTT ACAGACAAGG TCACGGTGTC ATCCAAGATG 480 

GTATCATAGC CATCTGGCAA GGTCATAATA AAGTGGTGAA TTCCCACAGC CTTACTAGCT 540 

TCCATCATTC GTTCATCACT AATCCCTATT TGATTATAGA TGAGATTGTC TCGAATAGTT 600 

CCTTCAAAGA GCCAGGTATC CTGCAAGACC ATTGAAAAGG CATCATGCAC TTCTGAACGC 660 

GTCATAGCCT TGGTATCCAC ACCATCAATG CGAATACTTC CCTTATCAAT CTCATAGAAT 720 

TTCATCAAAA GATTGACAAT GGTTGTCTTA CCAGCCCCAG TCGGCCCAAC AATGGCAACC 780 

TTTTGACCAG CATGAGCTGT CGCAGAGAAG TCATAGTCTT GAACATTGAC ACCGTCCACC 840 

AGAATTTCTC CTGCTGACAC GTCGTAGAAA CGTGGAATCA GATTGACCAG AGTTGATTTA 900 

CCAGAACCTG TTGACCCAAT AAAGGCCACT GTTTGACCAG TTTCTGCTTT AAAGCTAACA 960 

TGTTCAATAA CTGCCTCCGA ATTTGCCGCA TAGCGgAAGG TCACATCCTT AAACTCGACC 1020 

TGACCTTTGA AGTTTTCATC AGTCAGCTGC ACTTGAACAG GGTTTTGGAT AGAAGAATGC 1080 

AAATCTAAAA CTTGATTAAT CCGCTTAGCA GAGACCATAG TTCGGGGAAG AACGATGAAG 1140 

AGTGCTCCCA TGAGAAGGAA GCCCATGACA ACCTACATGG CATAAGACAT GAAAACAATC 1200 

ATGTCACTAA AGAGAGGCAG ACGCGCTATC GGAGCAGCGT CGTTAATCAC ATAGGCCCCA 1260 

ATCCAGTAAA TCGCCACACT CAAACCACTT GAAATCCCCA TCATGATAGG ATTCAAAATA 1320 

GCCATAAGAC GGTTGACAAA CAAATTCAAA CGGGTCAATT CATCATTTAC TGCTGCAAAT 1380 

TTTTCATTTT GATAATCCTC TGCATTGTAG GCACGAACGA CACGAATACC TGTTAAACTC 1440 

TCACGAGTGA TACTGTTCAG TTTATCTGTC AGCCCCTGAA TCAAGGACTG TTTTGGAAAG 1500 

GCTAGCGTCA TCAAAACGGT CGTCATCAGG ACGTTGATAA TCACTGCCAC AAGTACGGCC 1560 

CAGAGCCAGT ATTCTGAATG ACCTAAAATC TTCCCAATAG CCCAGATAGC CATAATTGAA 1620 

CCACGCGTTA CCACTTGCAA GCCCATAGTA ATCAACATTT GAACTTGAGT AATGTCATTG 1680 

GTAGTACGCG TCAAGAGGCT AGGAATTGAA AATTTCTTAA TCTCTGTCTG CGAGTAATCC 1740 

AAAACTCGGT TAAAAATATC ACTTCTCAGC CTACTAGTAT AAGAAGCCGC CACTCGGGAT 1800 

GCAAAAAATC CAACTGCAAC TACGGACAAG AAGGCAAGAA AGGACATTCC CATCATCATG 1860

CTTGCCGACT GCCACAACTC ATCTAAATTA GTTTCTTGAC TACCTAGCAA ATCCGTAATT 1920 
TTCGAGATAT AGGTCGGCAC TTCCAACTCT AGATAGACCG AAAAGCAAGT AAAGAGAATG 1980

GCTAGTAAAA TCATCCCCCA TTCTTTTCTA CTAATTCTTT TGGCTAATTT CTTTATTCTC 2040

TCCTCCTATT CCCTTGATAT TTTGCCTGTA GTTGACCGAG AACCTTCTCA AAAATCAGTA 2100

ATTCATCTTC ATCAATGTCT TCCATCAACT GCTTGTCTAT GCGTTCAAAA AAAGCCTTAA 2160

CCTGTTGCAT CTGAGAACGT GCTTTGTCCG TCAGACGAAC AAACTTAGCC CGCTTATCAA 2220

CAGGACTCGC CTCCAATTCC ACCAAACCAT TTTGCACTAT ACGCTTAACC AGATTACTAG 2280

CAACAGGCTT GGTAATATTG AGTTCCTGCT CGATATCTTT AATCAAGACC AAGTCTTGGT 2340

TTTTCTCGCG ATTATCCAAA AAACGCACAA CCTGACCTTG CGGCCCACCC ATAAATTCAA 2400

TGCCGCAACG TTTGGCTTCC TTTTGCACCA TCAGGTGAAT TTGATGACCA AAACGCTTAA 2460

AGACTAACAT CGGTTTATCC ATAATCTCCC CCTTCTAAAT AAAAATAGTT CTCTGGAGAA 2520

TAATTAAATT TCTATGAGAA CTATTTTCTT GATTAAAAAA ATCCCAAGTG ATTTTCTCAC 2580

TTAGGATCAT GTTCTATAGG TTAAATTAAA ACCCATCTAC GTTCGTATAA ATCTTTTGGA 2640

CGTCTTCGTC GTCTTCAAGA ACGCTGTAAA GTTTTTCAAA GGTTTCAAGG TCTTCGCCTG 2700

ACAATTCCAC TTCTGACTGA GGAATCATTT CCAATTCAGT CACTTGGAAT TCTTCAATAC 2760

CAGACTCACG GAGGGCAACG ATAGCCTTGT GAAGGTCAGT TGGCGCTGTG TAAACTGTGA 2820

TTGTACCTTC TTGTGCTTCT ACGTCATCCA CATCCACATC CGCTTCGAGC AATTGCTCAA 2880

AGACTGCGTC CGCATCTTCA CCTCCAAATA CAATAACACC TTTGTTGTCA AAGAGGTAAG 2940

AAACAGAACC TGAAGCGCCC ATGTTTCCGC CGTTTTTACC AAAGGCTGCA CGGACATTGG 3000

CTGCTGTACG GTTGACGTTA GAAGTCAAAG TATCCACAAT TAGCATAGAG CCATTTGGCC 3060

CAAAACCTTC GTAACGTCCT TCTGTAAAGG TTTCGTCTGT GTTTCCTTTG GCTTTATCAA 3120

TCGCTTTATC GATAATGTGT TTTGGCACTT GGGCTTGTTT AGCACGGTCG ATAACGAATT 3180

TCAAAGCTGA GTTTGATTCT GGATCTGGAT CACCTTTTTT AGCTGCTACA TAGATTTCTA 3240

CACCAAATTT TGCATATACT TTAGAGTTAG CTCCATCTTT AGCCGTTTTC TTGGCTACGA 3300

TATTGGCCCA TTTACGTCCC ATTAGGAATC TCCTTTTTTC ACATTTTAAT CTTTCTTATT 3360

ATAACACAAG TTTTTTTGAT TTTCACTAGA GGAAATGGAT TTTATTAGCA AATCAAGCTA 3420

GGATAGCACT TTACCTGCTA AGATGGTCTT GCCTTTCTAT CTTTATCAAC AGGCACTCAT 3480

CCACATTCAA AAAACAAACT AGACCATTAT CTGCAAATAG AAAGTTTCAG CCAAGTTTGA 3540

CAAAGTGAGC TCAAATTACT GTTTGAAGTT TGTAGATATA AGCGACAAAA ACAATCATAC 3600

TGCACCTTTT GTTGACAGTC TACTCCAGAC ATATCATAGT TCAAGTAAAT ACTTTGAAAT 3660
TCAACAGTTC TTATAGGCGC TATTGTATTC TAAGAAATCA ATAGAAGAGT TTCTAAGCAA 3720

ACCTCTAATA CTCAATAAAA ATCAAAGAGC AAACTAGAAA GCTAGCCTCA GGTTGCTCAA 3780

AACACTGTTT TGAGGTTGCG GATGGGGCTG ACATGGTTTG AAGAGATTTT CGAAGAGTAT 3840

AATTTACGTG TTCCCAAGAT GGAGAAGTTA GACTAGTACA CTGGCACTTC TAAAACATTG 3900

CTAGCAATTG ATTTGTTCAT ATTTAATTTC ATTTTTTCCA TAAATGGGTA TTAGATATAA 3960

ACAGCAAAAT ATTTCCGATA CGTGTCGTTC TTGAATTTCC AATCATCTAA AACAAGTAAA 4020

GGATAATCAA TCCCCTGTAT ATCAAGGAAT TGGCTACCCT TTTTACTTTT TTACACATTC 4080

TGTTTGATAG ATTCATTTTA ACATCACGAG CATACTCCAA TGGAAATCGC TAGGCAAGAG 4140

ATAAACTTTC AGATATCCGC AGAGAGATCA TCGCCTCTTT TTGTCGCAAG CATTCTCCTC 4200

TCCTAGTCAT TTTCTACCTT ATCTTCTACC TGAGGATAGA GAGTTGTTCC CCAAATAGAA 4260

ATCGTCCGCT TACGCACTAG TGGCAAATCG GTTTTTTCAT AAACCGTACG CCACCATTCC 4320

CAGGCAAGCC CGGTACACTC TCTAATTTTG ACAGAGAGAT TACGAACATT CCCTTTTAAA 4380

GGAATACTAG TGGTAAAGTG AGCCGTTAAA TCCTGCCCAT TTCTGTCGCA AGCCTTAGGA 4440

GTCAAGACTT CCTTACCTTG ATGATCATAG GATAATTCAT TCCAAGTAAT ATAATATTGG 4500

GCAACATAGG CACCACTATG ATCCAGCAGT AAATCTCCGT TTCTGTAAGC TGTAACCTTA 4560

GTCTCAACAT AGTCTGTACT ATTTTGAAAG GTCGCAACTA CATTGTCACG TAAAAAAGAA 4620

GTTGTATAGG AAATCGGCAA GCCTGGATGA TCTGCTGTAA AGCGACTGCC TTCTTGAATC 4680

AAGTCCTCTA CCATATCCAC CTTGCCTGTT ACAACTCGGG CACCCGAACT TGGGTCGCCC 4740

CCTAAAATAA CCGCCTTCAC TTCTGTATTG TCCAAAATCT GTTTCCACTC TGTCTGAGGA 4800

GCTACCTTGA CTCCTTTTAT CAAAGCTTCA AAAGCAGCCT CTACTTCATC ACTCTTACTC 4860

GTGGTTTCCA ACTTGAGATA GACTTGGCGC CCATAAGCAA CACTCGAAAT ATAGACCAAA 4920

GGACGCTCTG CAGAAATTCC TCTCTGTTTT AAATCCTCTA CCGTTACAGT ATCTTGAAAC 4980

ACATCTCCTG GATTTTTAAC AGCATCTACG CTGACTGTAT AATAAATCTG CTTAAAATTA 5040

ACAATCTGAA TCTGCTTTTC GCCTGAATGG ACAGAGTTAA AATCAATATC AAGAGAATTC 5100

CCTGTCTTTT CAAAGTCAGA ACCAAACTTG ACCTTGAGTT GTTCCATGCT GTGAGCCGTG 5160

ATTTTTTCAT ACTGCATTCT AGCTGGGACA TTATTGACCT GACCATAATC TTGATGCCAC 5220

TTAGCCAACA AATCGTTTAC CGCTCCGCGA ACACTTGAAT TGCTGGGGTC TTCCACTTGG 5280

AGAAAGCTAT CGCTACTTGC CAAACCAGGC AAATCAATAC TATAAGTCAT CGGAGCACGA 5340

TCGACCGCAA GAAGAGTGGG ATTATTCTCT AACAAGGTCT CATCCACTAC GAGAAGTGCT 5400

CCAGGATAGA GGCGACTGTC GTTGGTAGCT GTTACAGAAA TATCACTTGT ATTTGTCGAC 5460
AAGCTCCGCT TCTTTCTTTC GATAACAACA AACTCATCGG GTAGCTGATT ACCCTCTTTG 5520

ATGAAACGAT TTTCAATACT TTCTCCCTGA TGGGTCAAGA GTTTCTTTTT ATCGTAATTC 5580

ATAGCTAGTA TAAAGTCATT TACTGCTTTA TTTGCCATCT TCTACCTCCT AATAAGTTCC 5640

TGGATTGAGT TGCATAAACT CAGACTTGTT CAGCGAAATC AGCCGTGGTT GGACTAAGTA 5700

ATCCAAAATT TCCTCGTACA ATTCTTGTGA GACATTGCGT CGCCGTCTGG CTAAATAAGA 5760

AGTCGGAATG ACCGTATTAT CCAACATAAA TACCTTATCT AAGTCAATCA AGGTTGGTCT 5820

TGTAAAAGGA TTACGAGCTA GATCCGGCTC TTCTATCATA AAGTTCTTGA CCAAACGTCT 5880

GGTCAAGAGA GCTGGTTTGA AGGTCTGATT TTTAACCAAC TCTTTGTTTT TAGTCATGCT 5940

GTTGTCAATA CAGATATACA TATGATTCTT CACAGCCAAA TCGCTACTAA TAGTCGGAAA 6000

AGGCAAATAA AGAGCTACAA CATCTCCTCT CTTAATCAAG CAAGAGCACC CCCTTTTCTC 6060

CTAATGTAAC ATAGACAGGA TTGACCAAGT CTTCTGATTG ACTCAGAATT TCCAAAGTTT 6120

GAGTTTGGCG CGCTGTCAAT TTAGTAGCAT CTTGTCTCTT CAATACAAAA TGCTTGTCGC 6180

CAATAACCTT GACAATATAA TCCTTCTCCA AAGCTGACTG GTAAATCCAC ATCAGATGTT 6240

GTCTGTCCTG AGAACTCAAG AGAGAAGGAT TTTCAAGCCT CCCGATAGTC TGATAAAAAT 6300

CAAAAACAGG AGCTAACTCC TGCCAATCTG ATTGGCTAGT TGTCAAGGCT AGAAAAAGGG 6360

CTTTGCGAGC TGATACTTCT TGGTTAGCCT TGAGAGTTAC TTTCCCCTCC AAGTTTTTTA 6420

GAAATCGGGA AACTCCAGAA AGCAAATTTT TCTCTAACTG CGAGAAATAA AAACCTTTCG 6480

TTCCCAGACA TAAGTCTTTC ATGTCGCTTT CTCTAGCAAA TAAGAGCTCA AACATTTGAT 6540

AGTAAAAGAA AAATATCTGG CACTGGGTCG CGCTCATCTT TTCCTTATCG GCTTCTTTTT 6600

TTAACCAGAG CAAGGGCGAC AGGTAGCTGG ATTGAGACAT TTCCTCTACC TCCTACTCTT 6660

TTTTAACTGG AGCATCTGCA CTAGCTGCCA CTTCTTTTGA CTGGATACTT TCCCACTGGT 6720

TAATCTCCTC TGAGATAAGA CCTTCGCATG TCTTGACAAA TAGGGCAAAA GCCTTGGTCT 6780

TTCCTGCATA TTTCTCCGTT TGGCATTGAT AGAGGAATTT TTCTTTCTCC AGGAGTTGCG 6840

CAGTTTTTTG GTAAGAAATC CAATTTTCCT TTGCATTATA CAAATTGATA ATCCCCTCAC 6900

ACAGCAAGCC GAGACTGGAT AAGGCAACCG AAATCAAACG GTAGCGATCA CCTGGCATAG 6960

GAATAGCACA AAAGACAGCT ATGAGGAAAC CTGCCACGAT TTCTGTTATT TTTAATACCT 7020

TATAGCGCCT ACGATGTTGA ACGCTTTTCT TTAAAAAATG AGCTATCTGT ACGTCTAATC 7080

GCTCTGTCAG GTACATTTCT TCTGGCGTCA TATTCGTAAC TCCTTTCATT TACTTTGATA 7140

ATCAGGG 7147







(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 755 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
CCGCATGGGA TTGGTGTCCT TTTGGGCAAT CTCTTTGACC AAACTGGAAA CATGTTTTAT 60
GCGCCTGCCT TTACTGCCCT TGTCGGCGGT ACGTCTATAT GATCCTAGTC GCAAAAGTTC 120
CGCGCTTTGG AGCCATTACC ACTATCGGCC TTGTCATTGC CCTCTTTTTC TTGGGAACTA 180
AACACGGTGC TGGTTCCTTC CTTCCTGGAA TTATCTGTGG CCTCCTAGCA GATGGAGTAG 240
CTCATTTAGG AAAATACAAG GACAAAACAA AGAACTTCCT TTCTTTCATT ATTTTCGCCT 300
TTAGTACAAC AGGACCAATC TTGCTTATGT GGATTGCGCC CAAAGCCTAT ATGGCTACTC 360
TTCTGGCAAG AGGAAAATCC CAAGAATATA TCGACCGTAT CATGGTCGCT CCAAACCCTG 420
GAACTGTCCT TCTATTTATC GCAAGTATTG TCATCGGAGC CCTAGTGGGT GCCTTGATTG 480
GACAAGCCTT GAGTAAAAAA TTTGCCCAGA AAATCTGATC AGTTAAAAAG AGCCACGCGG 540
CTCTTTTTTA TTTATGGCTC AATTTCTTAG TCAAGAAATC TCCCAAGAAT TGGATTGCAA 600
AGATAATCAA AATGATAATA ATGGTTGCCA AGATGGTCAC ATCGTGATTG TAGCGGTTAA 660
ATCCATAAGC GATGGCTACG TTACCGATAC CACCAGCTCC AACCGCACCG GCCATAGCTG 720
TTtcCCAACA AGGGaAtCAA GGTcACAGTC GTCAC 755
(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3010 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

TTCAATTGGT ATCTCAATCA ACGGTCTTCA CATGGTTTCA ACTGGTTTGA CTCTTGAAAA 60

AGCGAAAGCT GCTGGTTACA ACGCAACTGA AACAGGCTTT AACGATCTTC AAAAACCAGA 120

ATTCATGAAA CATGACAACC ATGAAGTAGC AATTAAGATT GTCTTTGACA AAGATAGCCG 180

TGAAATTCTT GGTGCCCAAA TGGTTTCACA TGATATTGCA ATTAGCATGG GAATCCACAT 240

GTTCTCACTT GCTATCCAAG AGCATGTGAC AATTGATAAA TTGGCATTGA CAGACCTCTT 300
CTTCTTGCCA CACTTCAACA AACCATACAA CTACATCACA ATGGCTGCCC TTACGGCTGA 360

AAATTAAAAA TGAATGAGCT ATCTGGCCTT AAGTTAAGGT CAGATAGTTT TTAGCTAATT 420

TGTCCCCATA CAATTATAGT TTTTTTATCT TGTGCTTCAT TCTGTTCTGA CTTAAAATGA 480

AAAGGTAGCT ACCAATACAA ATGATGAGGA TAAAACAAAT GACTGAAAAT CGTTATGAAC 540

TAAATAAAAA CTTGGCACAG ATGCTCAAGG GTGGTGTTAT TATGGATGTG CAGAATCGTG 600

AACAGGCTCG TATCGCAGAA GCTGCTGGTG CGGCAGCTGT GATGGCCTTG GAACGAATTC 660

CGGCTGATAT TCGTGCAGCT GGAGGAGTTT CCCGCATGAG CGACCCAAAG ATGATTAAGG 720

AAATCCAAGA AGCGGTTAGT ATTCCAGTAA TGGCTAAGGT CAGAATCGGG CATTTTGTTG 780

AAGCTCAGAT TTTAGAGGCT ATTGAAATTG ATTATATCGA CGAGAGTGAA GTTCTATCTC 840

CAGCTGATGA CCGTTTCCAT GTGGACAAGA AAGAATTCCA AGTTCCTTTT GTCTGTGGTG 900

CTAAGGATTT GGGTGAAGCC TTGCGTCGTA TCGCTGAAGG TGCTTCCATG ATTCGTACCA 960

AAGGAGAACC AGGGACAGGG GATATCGTCC AAGGTGTTCG TCATATGCGT ATGATGAATC 1020

AGGAAATTCG CCGCATTCAA AACTTACGTG AGGACGAGCT TTATGTTGCT GCCAAGGATT 1080

TGCAAGTCCC TGTAGAATTG GTCCAATATG TTCATGAACA TGGAAAATTG CCAGTTGTAA 1140

ATTTCGCTGC TGGAGGTGTT GCAACGCCAG CAGATGCTGC GTTAATGATG CAATTAGGGG 1200

CAGAGGGGGT CTTTGTCGGT TCAGGTATTT TCAAGTCAGG AGATCCTGTT AAACGAGCGA 1260

GTGCCATTGT TAAGGCTGTG ACTAACTTCC GTAATCCTCA AATCCTAGCT CAAATCTCTG 1320

AAGATTTAGG AGAAGCCATG GTTGGTATTA ATGAAAATGA AATCCAAATT CTCATGGCTG 1380

AACGAGGAAA ATAGATGAAA ATCGGAATAT TGGCCTTGCA AGGGGCCTTT GCAGAACATG 1440

CAAAAGTGCT AGATCAATTA GGTGTCGAGA GTGTAGAACT CAGAAATCTA GATGATTTTC 1500

AGCAAGATCA GAGTGACTTG TCGGGTTTGA TTTTGCCTGG TGGTGAGTCT ACAACCATGG 1560

GCAAGCTCTT ACGTGACCAG AACATGCTAC TTCCCATCCG AGAAGCCATT CTATCTGGCT 1620

TACCAGTGTT TGGGACCTGT GCGGGCTTAA TTTTGCTGGC TAAGGAAATC ACTTCTCAGA 1680

AAGAGAGTCA TCTAGGAACT ATGGATATGG TGGTCGAGCG TAATGCTTAT GGGCGCCAAT 1740

TAGGAAGTTT CTACACGGAA GCAGAATGTA AGGGAGTTGG CAAGATTCCA ATGACCTTTA 1800

TCCGTGGTCC GATTATCAGT AGTGTTGGTG AGGGTGTAGA AATTTTAGCA ACAGTGAACA 1860

ATCAAATTGT TGCAGCCCAA GAAAAAAATA TGTTGGTAAG TTCTTTTCAT CCAGAATTGA 1920

CTGATGATGT GCGCTTGCAC CAGTACTTTA TCAATATGTG TAAAGAAAAA AGTTGAGATT 1980

GAATTTCTCA ACTTTTTTAC ATGTAATAAA CAATAGCGAT GTATTGAAGT GCGGACGCAG 2040

CTAGGATAAA GAGATGCCAA ATCATGTGGA AATAAGGTTT TTTCTTGGCA TAAAATCCAG 2100

CTCCAACTGT ATAACAGAGT CCGCCAGTTA CCATGAGACT CCAGAAAACG GGTGTCGTTT 2160 

GACTGATAAT GGCAGGAATG ATAGCCAGAA CCAACCAGCC CATAATCAGG TAAAGAGCAA 2220 

GGCTAAATTT CTCATTGACC TTTTTAGCAA AGATTTTATA GAGAATACCA AAGATGGTCG 2280 

TTCCCCATTG GATGACAATA ATCAGATAGC CAAACCAGTT ATTCATCAAG GTCAAGACAA 2340

CGGGCGTGTA TGAGCCGGCA ATGGCAACGT AAATCATAGA ATGGTCAATG ATTCGCAAAA 2400 

CATATTTGTG GGTCGAACCA TAGGCCATAG AGTGATAAAT GGTGGATGAT AGGAACATGA 2460 

GAAAGAGACT GATGACGAAA ATGGAAACGC CGATAGAGGA TAAAAATCCG TGTGCTTCAT 2520 

AACTATAGAT GGATGAAATA GGCAGCAAGA TAAGCATGAT GACTGCACCC ACAGCATGGG 2580 

TCACGCTATT AGCAATCTCC TCTCCAAAAC TGAGTTGTTT GCTGAGTTTA AGACTAGTGT 2640 

TCATTGGATT ACCTCCTCTT GAGTATGATC GATTAAGTCT AGAGTTTGAT GATAGAGTTT 2700 

AACGGTTTGG CAGCTGGTTT GGATAATAGG GTTAGCTGGG TCAATTCCTT GGTTCATGTA 2760 

GTCCACAAAA GCATCGTAGA GTTGGTCTGA ACTTGCTTGA GTTTGTAGAG TATTAAGTGT 2820 

CTGGGCTATT TCTTGAATAG AAAATACAGA CTTGAGGGTT GTGATAGCAA TCAAACGGGC 2880 

AATCTGTTGG CGTTGGTATT TTTTTTTGTC AGGCTTTGTC AGGTAACCAT TTTTCACATA 2940 

ATTGTTGACC ATAGATGCTG TTAGGCCCTT GTCTTTATTA GGAGAGATAG GGGCGCAGAC 3000 

CTGATTGACA 3010
(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15213 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

CATAAATCGG TGCAAATAAC TTAATAGTGA AGTAGCCATT TCTTTCGTAT TTACCTGAGG 60 

CATATTCCCT AGACGAAAGA ATATTATTAT CAATCAAATC ATTGAATGAA CGTAGTCTTT 120 

CAACTTCTTC TACTGTTAGA TTTCTGACAA CATTTGTTGC ATAGACCTTA TTTCCATCAG 180 

GATCAGGATG GTACTCATTT GTAACTTTTC TAAGAAGTTG TTGTTTTTGA TTCGTATCCA 240 

ATTTAAGAAT TGAATTTCCT TCGAGATATT CCAACATATA AACAACGTCA AACATGTTGT 300 

GGACATATTG CTTCAAATCA TCTGCATTAT TAAATCTTGT AGTTGGATCA AGTACTTGTA 360 

ATCGTCGACT TTCTGTACTA TCAGATTTTG AATGTTTCAA GATGGAGTTG ATGGTAATGG 420 
TCGCATCATC TGGATGGTCT GGTGCTTGTA ATAATCCTTT AGCAAAGAAC TCTGGTCCCA 480

AGCCACTTCT TCGACCATAT CCTCCAAGAT AAATGTCCTG ATCTGAGTCA TGTGTCATCT 540 

CATGCGTATA AGTAATAGCT CCATCCTTAT CCAACATTCG ATAACCCATA TAATAAACTG 600 

CATCACCTGT AGCATAAGCA CCGTGTTGAT TATGCCCAAC TTTATTTCCA ACAGGTCCAA 660 

AGAAATGTTG CATTGCAGGA TTTGGATTAT CAAAATCTGC CACTTCTGTA GCTTTCCCTA 720 

CGGTATTATC ATCGCCAAAT TTATAAGCAT CGTAAAGCAA AATATTTCTA TAAAGTTTTT 780 

CACGTGCATT GTCGTCTAAA ATACGATACC AATAATCGTA GTGATCTCGC TGACGTTTGG 840 

CTGTTTCACG CGCATTTTCT TCAACAAAAT CATTGAGAGC CTTGCCCGCT TTATGGTCAC 900 

TACTGCGGTA GCGATCATAA GCTCCAAATC CTAGACTAGA CATGGTCGAG ATGACAAATA 960 

CGGATCTCTC TGGCAAGGTC AGGAGAGGCA AGACCATATT GCGGTATTTC CATGTGGCAC 1020 

TCGTGATACG ATCATAAACA CCGATAGAAT ACTTGGTGCC AGCTAACCCT TGCTTCGTTT 1080 

TCACCTCTTC GATAGTGGAT TTTTCTTCGA CAATGTAAGC CTTAGTCTCT GATTTAAACC 1140 

AGTCATTATT GCTTGTATTT GGTAAAAAGA CTTTTCGGTA ATGTTCCAGC GTGCTAAACA 1200 

AATCTGTCGT TCCATGTTGA CTGGCAAGAC TGATACCATA AGTATCGACA TTATTCTTAG 1260 

CTAGAAGATT GTTAAAGCCA GATTTACCCA ACTCAATCAG AGTATCTAAT GGTGAAGCAT 1320 

TCCCCTTACC AAAGAAGTCC AAATGGTACA GAACTAGGTC TTTGACATTC ACCTGACCAT 1380 

AGCTAAAGTT ATACCACCGT TCCAGATAGG TCAAGCCAAG TAGCAAGGCT TCCTTGTTGC 1440 

GTTTGATTTT ATCTACAAGA TAACCTTCAG TGACGGGGTT AGCACTAGCC AGTCCAGCAT 1500

CCGCTGACAA GAGTTTTTTC AAACTGTCTT CCAGTTGTTG TTTTGTTTTG GCGAACTGGT 1560 

CTTCTAGATA GAGCTCAGTT TGCTTGACGT TTGGAGAAAT ACCCAGCGTG TTTCTGATGG 1620 

CTTCTGAATG ATAGTCAACC TTTTGTAAGT CAGGTAAGAC TTGCTTGATG ATAGAGGTTT 1680 

GGTCATACAG GAATTGGTTT GGCGTATAGA GAAGTCCAGT ATTGCCCAGA CTATATTCTG 1740 

CTAATTTGGC GAAATCATTC TGGTATTTGA GATCCAGCTT CTCAGATAAA TCATCCTTGT 1800 

AGTGAAGCAA GAGTTTGTTT GCAGTCTGTT TGTTAGAAAC AATGTCTGTG ATGACTTGGT 1860 

TGTCCTTCAT CATGACTGCT GACAAGAGTT CTTTTTGATA TAAAAGACTG TTCTCATTGA 1920 

CCAGGTTTCC GTATTTGACG ATGGTTGCCT TGTTGTAGAA AGGTAGCAAT TTTTCAATGT 1980 

TTTTATAAGT CAAGTTGCGC TTAGCTTGAT AATAGGCCAC CTTAGAAAAA TCACTGTCTT 2040 

TTTTGCCACT TGTTGAAAGT GGCTCCACTG TTGGTAAAAT GAGAGGATTG ATTTCTGCTT 2100 

TTTTGCTTGC AATTTGAGAA GCATCTAGCA TTGTTCCTCT TTCTTCAAAG GATTCCTTGC 2160 
TGACGACCTC ATCCTTGACC AAGGTGACAT TGTAGACTCT GTTGGCCTTG CTGCTGAATG 2220

TGTCCTTTAC CTTCATTTCG TTATAGTGGT AACCAGTGAT GGCATTTCCG TTGGTTACAT 2280 

TAACATCGCT GAGAACATTG GTCAAACTTC CAGCATGCCT AACATCACCA GAAGTTCGAT 2340 

CCCACAAATT GCCTGCCACT CCAGCGACTC TACCAAAGTG CTTGACATTG TTGATATCAC 2400 

CTTCAGCATA GCTATCTTGG ATCTGTGCAT CTCGGTCTAC TAGGCCTGCA AGTCCACCCA 2460 

CAGTCTGATC TGAAGTATTT GTGTTAGATG AAATGGCTAC TGTCGCTTTT GACTTAGTAA 2520 

GTAAAGCCTT GTCACCTGTC AAATGACCGA CCATACCACC GATATTGTAG GCAGCAGTCG 2580 

TTTCATAAGT GTTGATAATT CTTCCCTTGA AACTGCTCTC TGTGATGCTT GATTGCTCAG 2640 

CCTTAGCCAG CAAACCACCG ATACCACGTT CACCAGCCAG AACACCATCG ACGTGAACTT 2700 

GCTTAATTTT TGTGTTATTC TGAGCTTCAT TTGCCAGTGA ACCGATATCA TCTTTCCCTG 2760 

AAATAGCAAC ATTTTTTAGA CTCAGTTTTT CTACTGTAGC ACCACTCAAG TTTTCAAACA 2820 

GAGGTTTTTT CAAATTATAG ATAGCATAAT TCTTGCCATC TTTTTCACCG ATTAAACGAG 2880 

CAGTAAAGGT GTCCTTGATA TAGGATCTTT CATCAGGACC AAGCTCCACT TCGTTAGCAT 2940 

TCAGGCTGGC CGCTAAATGA TAGGTTCCAG AGGGATTTTG GTTTATAGCT TTGACCAGAT 3000 

TACTAAAGGA AGTAAAGTTT GTTGTTTCTT CTGTTCCCTT CTTAGCTAGA TAGAAGGTAA 3060 

AATTATCTTT ATATCTGCTT TCTATCTCCT GCTGAAGCTT CTCTACTTTT GCTGTGATTT 3120 

TATAAAGGAT TTTATCATTT TTTCTTTCCT CTGATATTGA TGCTACTGGT AGGTATACAT 3180 

CTTTGAATGA AGAAGATTTC ACTTTAACAA AGTAGCTATT TGGATTGCTT GGAACTTGCT 3240 

CTAACGAAAT GTGTTGTTTA TAAGTACCAT TTGACAAACT GTATAACTCT AGGTCGGAAA 3300 

CATTTCTTAA TTCAAGTGTT TTCTCTGGTT CTTCTACCTT TTTATCAGGG TCTAGTTCAT 3360 

TTTCTTGTTT AATTTCTTCG TTTCCATTTG AATTGGATGT GTTTGATTCG GTTGAAACAT 3420 

CCTCAGTTGA ATTTCCGTTT GATGGTTCTG GTTCTGTTTG TCCATTCTCT GATGTTGTAT 3480 

TACCTGAATT TTCTGGTTTT GTTGCAGTTC CGTTTTTTTC TGGTTGATTT GATTCTTCAA 3540 

CTGGTGGTTT TGAATCACTA GGTTTATTGG ATACTTCTCC AGTATTTTCG TTAGCTATTT 3600 

TCCCAGAGTT TGTTTGTGTT TCTTCTGCAG GTTGAACTGG TTTTTCTGTT TCTTGATTTG 3660 

AGGTACCTTC TACTGTGCCT TCATTTGGAT TTACTGGAAC TTCTTCTACA GTTTTTTCTG 3720 

AATTTTCATT TTTAGAGTCA TTATGTTCTG GTTTATTTGA TTCTCCAACT GAGGTTGTCG 3780 

AATCACTAGG ATTACTGGAC ACTTCCCCAG TATTTTTGCT AGATGTATCT GGTGATACTT 3840 

TCTCTGAATT CGTTGTTGAT TCTTCTGCAG GTTGAACTGG ATTTTCTGCT TCTTGAATTG 3900 

AGGTTCCTTC TGTAGTACCT TCATTTGGAT TTACTGGTGT TTCTTCTGTT GGTTTTACTG 3960 
GAACTTCTTC AGTTTTTTCT GGACCTTGTT CTTTGGTCTT CTCAACCGGA GTTTCAGGTT 4020

TTACTTGCTC AATATTACCC TTATATTCTG GAAGCGGTGC TACCTGCTCT GGTTCACCTT 4080

TATCACTTAC CACAGTATCT GGCGACTCTG GTTGAACCTC AGTCTCACCT TTGTCGGTCA 4140

CAACTGCTTC GGGTAATGTA GGTTGAACTT CTGGTTCGCC TTTGTCAGTT ACTACAGCTT 4200

CGGGCAACTG AGGCTGAATT GCGGGTTCAA CAATAGCTCC AGACTGTACG TCCTTATGTT 4260

CTACACCAGT CTCAGGTTGT TCCTTTATAA CTTGAGTTTT TTTAGTACCT TTTTCGACTA 4320

TTCTTGGACT AGGCGCAGTC GTTGAAGTTG AAAGAATTTC TCGCGAAACT TCTTCCTTGT 4380

TTACAGAGAA TATTCTGACG ATTTCAACTT TCTTACCTAA TTTACCTTCT TGTTTTACTC 4440

TTACAGTTCC TTCAGCTAAA TCAGGATTTT CTTGAATTTC TTCTTGAAAA TCTATTTTTG 4500

TCTCCATAGT TTCCTCACGA TATAAGAGTT CAGGTTTGTT CAATTGACCT GATAAAACTT 4560

CATCCTGTGG ATTTAATGTA TTTACCCCAG TCTTTTCTTT TGGAGAAATC TTCTCCTCTT 4620


TCTTCGTTTC 


TAGATTCTTA 




ATTGTTCTTG 


AGAATCTGAA 


GATTGTTTCT 


4680 


CTTCTTTTCT TGGATTGATT AATTCAGTAG AGAAAGGTTT TTCAACTACT TGAACTTCTG 4740

TCGGCTTAGT TGAAGAAACA GGTGTTTGTT CCTGAATAGC TTGTACTGTT GATGGATGGT 4800

CTACAAAATT CGGTGTAACA TTATAATCCA CCTTTTGTTG TTTTGTAGGA GTGGCAACTG 4860

AACTCTTTTG ATTACTTACT TCAGACTCAG AAGTCGTTTT TCCCTCTTTG ATATATCCAA 4920

TATAAGTGTA ACCTGAAATC TCTTTAGGAA GAGGTAATTT TTCTCCAGAG GTCAATTCAT 4980

AGTCCGTATT GTAATTTAGC AAAAGATGAT TTTCTAAAGC ATGGACTGAA ACTAAGACAC 5040

CATTTCCTAT CCCTGCAACC AATACTAAAT GTAATACCGT TTTATTCTTA ACCTTTTTCT 5100

TGGAAACAGC AAAAATTAAA ATTCCCATAG CAGCTAAGCT AGCACCAGCA ACTAGGGCTT 5160

GCCTCTCATT CTTGCTTCCA GTATTTGGCA ATTCCGCCAG TTGATTTTGA GAATTTAACT 5220

TATAAACAAG ATAATAAGTT TCATCATCAT TCTCCACGTA TGTCGGAATA TCATAGACAA 5280

GCTGCTTCTT TTCTTCTGAT GATAGCTCTG AATCTGCCAC ATATTTATAG TGAACTCCCG 5340

CAGTTTCTTG AGCATCCACA GATGAACTAG CTAATACAGA CATAAAAAAT AAACTTGAAA 5400

TCGTTGCAGA TACAAGTCCT ACTGATAATT TTCTAAATGA AAAACGCTCT TGTTTTTCAC 5460

CAAAATACTT TTCCATTATT CCTCCTTGAA ATAAAATTTA TATATGTTAC AAAGACCTTT 5520

ATTATATTAG TGTATTATCT ATTATCTATA GAAAAGGCAG TATACCTTAA TTATACTCTT 5580

AATTTACAAA AAAGTCTTAA AATTGAGATG CGCTTTCATA CTTTGTTTTA TATTATTTGG 5640

AGGTACAATA ACACCTACCA TGAAATTTAC ACGGTAGGTG TTACTCATAT CACTAATCGT 5700

TCTAAAAATG GTTTGAGGCA GTTGAGGAGA ATTCCTTCTA TCCAGCTTCC TTGTGCTGAT 5760

GAGCGATGGT CTTCCTGCAG GCTTTTTTTT AGAAAATCTC GGACTTGTTC TGGTGCGATT 5820 

TCAAATTCAA AGGCTTTCAT TTTATAGAAA AAGTCGATGA GATGATCTGA CAGGTATTCA 5880 

GTTGAAAAGG GTACTTCACC ACTTTTTCTA TATTCTAATA AGAGTCTAGA AAATCGAGCT 5940 

TTTTCTTCAG GAAGCTCACG AAAATAGGAA TTGAGGATCC AAGTCTGCTT CTGTTTTCTT 6000 

TCAATTGGAT CCTGACTGGC AATTCGTTGG TCTTTTTCCA GCTCTTTTTG GTATTGTTTG 6060 

GCCTTGATAG CTCGTTCTGC TCTATTTTTA CCAAAAAGAA TTTTTTCCCA CTTGCGTTCT 6120 

TCTTGAGTCA GGGTCTCTGT AAAGCCAAAG TAATCTTGAT AAGCACGCTC TGCGGGTCCC 6180 

ATGGCTAGAA CCAGATTGTC TGCATATTGC TTGGCGATTT TATCCCTCTT CTTGCGTTCT 6240 

TTCTGTGCCT GGATACGGAG TTCTTGTTCG TAGTCAATTT TCTCCTTGCC TAGCTTGACA 6300 

AGGTAGAGTT GGTCATCCGA TTTCCCAAGT AAAAAGGGTT TGATACACTT TTCAAGGACT 6360 

TCTTCCATCC GAGCCTTTTT CTTTGGTTCC GCCTTGGTCC AACTTCCTCC CTGAAAGACT 6420 

TCTAGGAAAA GCTGGTAGTC TCTCTCAGGC GCAAATTGAT TGCCACGATT GGGTTTGAAA 6480 

ACACCTTTTT CCCAGAGCCA TTTTAGAAGT CGCTCGTCAA AGTTACTTTT ATTGACCTTG 6540 

ATTTTTTCCT TTTTCTGAGC TTTTCTGGTT AGATTTTCAA CCTTTCTGAG CAGTTTTTCT 6600 

TCCTCTTCCA ATTGCTGGTC AAGGGACAAT CGATGAAAAT GACGAACACA GTCGCTACCA 6660 

ATTGGAAAGA GGCGTTGGCC TGTGACACCG TTAAAGAGTT CATAAGCGTA TTTGATGGCA 6720 

TTTCCACAGA CACAATTGCT ACGGCCGATA CCGTTAAAAA TAAAGGAAAC TTCATTCCAT 6780 

TCCTTGGTAG CTTGTTCCCA AGTATCCGCT TTCGAAGCCT GTAAAACTGC ATCGTGCAGG 6840 

GATTTTCTAA CTGGAAGTGT CATGAGGTCT CCTTTCTAAT ACTCAATAAA AATCAAAGAG 6900 

CAAACTAGAA AGCTAGCCGC AATCAGCTCA AAACACTGTT TTGAGGTTGT AGATAGAACT 6960 

GACGAAGTCA GCtCAAAACA CTGTTTTGAG GTTGTGGATA GAACTGACGA AGTCAgTAAC 7020 

CATATATACA GCAAGGCGAA GCTGACGTGG TTTGAAGAGA TTTTCAAAGA GTATAAGTTA 7080 

TACTTTTACA ACTTGAACCT CGTCTTTACC GAGTAAAATC AAGTATTTTT CAATATTTTC 7140 

AATCGAATAG GCTCGTGATA AAGCCTCTTC GTATAGAGCT AACTGACCAC GATAGCGGTC 7200 

TACGAGTTGA CTTGGTTCAT CATAGCGGTC TGTCTTGTAG TCGAACAGAA CAATTTTGTT 7260 

TTCGTAAAGC AGATAGCCAT CAAGGATACC ACGGACAACA AAGTCTTCCT GACTCTTTTG 7320 

GTCTCGTTTG AGCATGGAGA AAGGTTGCTC GCGATAAAGA TGGTCGGTAT TAGCAAGAAT 7380 

TTCCTGACCG AGTACTGTGT CAAAGAAAGC AAGAATTTTA TCAAGATTGA TCTTGTCTCT 7440 

GACAGCTTGG CTAGTTTGAA CTTGTTTGAG TGTTTCTGTT AGGCTAGCAA GGGTTAGTTG 7500

CTGGCTGAGG TCAATTCTCT GCATGAGTTC GTGAGTAGCA CTACCAATCT CAGCTCCAGT 7560

TACCTTTTCT TTGGTTGAAA AATCTGGCAA ATCGAAGCTG ATTTTCTTGC CTACTGACTG 7620

ACCTTGACCA GCAATCTCGA CACCTTCCAT ATCCATAACT GGTTCGTAGA ATTTCTTGAT 7680

TTGACTTGGG GTTTGAACAC TAGGAAGTTC AATAGCTGCG CGGTGAAGAG TATTATAAAC 7740

TTCCACCTCC TTCAGCATTT CCAGAGCTTC TTTGATGGTA TCTGAGTGAC GATTGTCTGC 7800

TTGGGAGCTA TCTTGGAGAG GACTCTTGGT TTCCAACTCT CCGATAGCTT CTGTGGTCAA 7860

CTGATCTTCG CCAATAAAAG GATAACTAAA GTTGAGCTTG TCCTTAGTAA ACACTTTACT 7920

GATAGCCCAA AGCCAATCTT GGAAATTCCG TGCTTGCAGT CTAGTATTGC TATTTAGTTT 7980

CCCATTTTTG GCTGCTGGGT ATTCCTTGGA TTCCAGCTTT TCACGAGAAC CCTTGCCGAC 8040

AAGATAGAGC TTTTTCTCAG CCCGCGTCAT AGCAACATAC AGCAAACGCA TCTGCTCAGA 8100

ATAGCTTGCT AGCTGTAATT CCTCTTCGTT CTGCCTATAG GTCAGACTAG GAATGGAGAG 8160

TTTGATGGTT TTAGGATAGT GGTCTTCTAC TGCCCCTGTC TCCATCTTGG CAATATATTT 8220

GAGACCAAGA CCATTCTGAC GACTGAGAAT GACTTCTGAC ATAGAGTCTT GCTTGTTGAA 8280

ATCTTGATCC ATATTGAGGA TAAAGACGTA AGGAAACTCC AGCCCTTTAC TCTTGTGGAT 8340

GGTCATGAGC TCTACTGCAT GTTTTGGCGG TGCGACGGCC ACGCTTGCCA AATCGTGCTG 8400

GGCTTCTAAG ACTTGGTCAA TCATACGAAT AAAACGCGAG AAACCTTTGA AATTGCTCTT 8460

TTCAAATTGA TCAGCACGCA GTGCTAGGGC ATAGAGATTG GCCTGCCTAG CAGGACCATT 8520

CGGCAAAGCC CCAACATAGT CATAATAAAA ACGGTCGTTG TAAATCTTCC AAATCAAGTG 8580

ATAGAGAGAG TGGGTTTTGG CATACAAGCG CCAAGAAGCT AGGATATCCA TGAATTGCTT 8640

TAGTTTTTCA GCTAGAGCTG TGTGAATCAA GCCTTTTTGA CTACTTGCCA TTTTTTGTGG 8700

ATTGACCAGT TTCTCATAGA GATTTTCGTG            TCTGCTTTCT GAAGGGACAA 8760

ACGTGCTAGC TCATCCTCAT CAAAACCAAA CATTGGAGAC TTCATAAGGG CAACCAAGGC 8820

GTAGTCTTGC AGGGGATTGT GAATGACACG AAGAGTGTCT AGCATGACTT GCACTTCTAG 8880

GGATTGGAGA TAATTGTTTT GCTCTCCGTC AGTTTTGACA GGAATTCCGT ACTCAGACAG 8940

GGCGAGGAGA ATCTGGTCAT TACGACTGCG GCTGGAGGTC AGAAGGGCAA TTTCCTTAAA 9000

GGCAACACCT TTTTCTTGAT GAAGTTTCAG AATCTCCTTG ATAACTAAGC GCATTTCGCC 9060

TGTTAGTTTC GTTTCTGTTT GACTCTCTTC TTCCTCACCT GTATCGTCCT TGTCGTAGAG 9120

GAGAAATGCT GCCTTGTTGT CTGGATTGGG AGTCAGTTTG GTATTGGCAA AAAGAAGCTG 9180

GTGCTTGTTA TCATAGTTGA TTTCGCCGAC CTCTTGGTCC ATGAGACGTT CAAAGAGATC 9240

ATTGGTTGCT GACAGCACTT CTGAACTACT ACGGAAATTT TCCTTGAGGA TAATGAGCCT 9300


GCCTTCTTGG GGATTTTGCG CATAGCGTTG GAATTTCTCA TTGAAAATCT GCGGGTCTGC 9360

CTGACGGAAA CGATAGATGG ATTGCTTGAT ATCTCCCACC ATAAAGCGAT TGTGGCCATT 9420

AGACAACAAT TCCAGCATCC GTTCTTGAAT ATGGTTGGTA TCCTGATACT CATCGACCAT 9480

GACTTCATGG AAGCGCTCCT GATAAGACTC ACGAACTTGT GGGAAATTCT CTAAAATCTC 9540

AATGGTGTAA TGGCTGATAT CAGCGAATTC GAAGGCATTT TCCTGTCGTT TTCTCTGACG 9600

ATAAGCCTCT ACAAAATCGC TCATGAAAGA TTGGAAGGTT TTAGCTAGTT TCCAAGTGTC 9660

TCCATGATAA CGTTCTTGAT AGTCGAGAAT CGCTATCTGG TCTGATAATT GTCCTAGTTT 9720

AGCAAACTGG GTCTTTCTCT CTTCGTTGTA GGCATCAGCC AGGGGCTTCA AATCAGCCTA 9780

CGGCTGGCAT TAGTCAGAGC TCGACCGTTT TTCTCCTTAG AGATGGCGAC AACACGCGCA 9840

AGCACTGCCT GATAAGCCTG ACTATCGGAC TCCTGATTTA GGGAGCCAAT TTCATCCAGA 9900

ATTAACTGAA CATTTTCTAA ATAGGCAGCC TTTGCAAACT CCTTGGCATC GTTATCCAGA 9960

TGGTAACGGA AAAAGCTTTC CAAATCCCAA AGGGCTTGTT TGATTTGCTC GGTCAGTTTT 10020

TCTTTTTCAC TGGTAAAATC AGCTTTCTCA AATCCTTTGA GGAAAGATTC ACTCAGCCAC 10080

TTTTGAGGAT TACTGGTGGA TTGGAGGAAG TCATAGATTT TATAGACCTG CTGGCGCAGA 10140

CCCCGTTCGT CCTTGCCACG CCCAGCAAAG TTTTTCAGCA AATGACTAAA GGTCTCTTTC 10200

TGTTTACCTT GGTAATGCGC TTCAAAGACC TCATGAAAGA CTTCGTTTTC GAGAATAAGT 10260

TGCTCGCTTT GGTTTTGTAA AATACGGAAA TTAGGTGCAA TATCAAGCAG ATAACCATGT 10320

TTGCCAAGGA ATTTTTGTGT GAAAGAATCC ATGGTTCCAA TGGCAGCGTT GGGTAGGTCT 10380

GCCAACTGGC GACCCAAGTG TTGTTTGAGG TCGACATCAT CTGTTTCTTG GATTTTCTTG 10440

CTGATTTTTT TCTCTAAACG TTCTTTAAGT TCAGTTGCAG CCTTGACGGT AAAGGTTGAG 10500

ATAAAGAGTT GAGAAATTTC GACACCACGC GCCAATTGGT CCAGAATGCG CTCTGCCATG 10560

ACAAAGGTCT TTCCAGAACC AGCCGATGCT GAGACCAGGA TATTCTGGGC AGAAGTGTAG 10620

ATAGCTTCGA TTTGCTCGGC AGTTTTCTTC TGTTCCTTGC TCGAATTTGC TTCTGCTTCT 10680

TGCAGTTTTT GAATCTCCTC CTCACTTAAA AAGGGAATAA GCTTCATCGA TTCAACTCCT 10740

CTCTTATTTT TTCAAGCCAA GCTTGCTTGA GTTTTTCTCC GACCAGACGC TTGCCATCAG 10800

CTAGGTCCAA CTTTTCTAGG AAACGGGCTT GGCCCAGATG GTAATTGGCT TCAAAGCCTG 10860

TAATAGCCTG ATGTTGCTGG ACGTATGGGG CAATGCTTCT GCCATTTTCA GTATAAGGAT 10920

TGATGGCGAA CCGGCCTGCT AAAATCTTCT CAGCAGCTTT CTTGTAAAGA TAGGCATTGT 10980

AGTCCAGTAG GAGCTGAAAT TCCTCATCTG TCAGTTGATT AGCCTTGTTT TTGTTATAAA 11040

ATTCGCCTAA ATAACTGCTT TCTTTTTCCA AGAAGAGCCC TTGGTATTTC ATAGATTTGC 11100

TGGCTTCTAC CACTGCTCCT GCCAGACTTT TTACCGCCAT CAGAGATTGG ACAGGTTCAG 11160 

CCATTTCCAA GTACATGGCG CCGAAAAAGT TCTGCTCCCC TTCTCTTTTT AGGGCAGCAA 11220 

GATAGGTTGG TAACTGAGAA TTGAGCCCAT TAAAGAAATG AGGAAACTGG AACTGAGTCA 11280 

GACTGGATTT GTAGTCTACT ACTCCTATCG CTCCATTAGC TTTCAAACGG TCAATCCGGT 11340 

CCACCTTGCC TCGTACAAAG ACACTGCGTC CATTGTCTAA TTGAATAAAG GCTTGGTCTT 11400 

TTCCACCAAA ATTTGCTTCT TCTTTGATGG TTTCGATGGC TGGATTGTGT CGGAGAATAT 11460 

GTCCAGTTGT CCGTGCAACA TCAAGCAAAA CTTCCTTGGT AAACTGGGCT TCGAAACTTT 11520 

CTTGATAAAT AGCTTCAAAT TCGCGTTCTT GACTGGTTTC TTGAATAGCT TGTTCTAGAC 11580 

GTTGGTCAAA GGAATCTTCA TTAGGCAACT GTAAGGCGCG TTCAAAGATA CGATGCAAGA 11640 

AATTCCCGTG ACTACGGGCA TCAGGATGCA AACGTAATTC CTCCTGCAAG CCTAAAACGT 11700 

AGCGTAGGAA ATAACTGTAT TCATTGCGAT AAAACTCTGT CAAACCCGAC GTAGACAGGT 11760 

AAAACTCCTG TTTGGCAGGA TAGAGAGCTT GCAAGGTGTC CTTGGCTAAG GTCTTGCTGC 11820

TTGGACTGGT TGGGATAGCT GGATTTTCCA GACCTTGCTG ATCTAGTTTT TTACCTATGA 11880 

CACGCGACAG AACCTTGACA AAAGTCAAAT CTTGCTCAGT ATCGCTCATC TCACCCTGCT 11940 

GGTGATAGGC AACCAGACTA GACAAAAGAC TGTGATAGGA CCCCATATCC TCCTTAGACA 12000 

GTCCTTTGTG ATTCATCCTC TTCTCTCTCC GCCTAAATCC AAAATGGATC AACTCTTGAA 12060 

GATAGGCAGA TTCCTTACTT TCACTTTCGT TAAAAAGGCT TGGAGCCGAC AAGAACAACT 12120 

GCTTACGAGC AGAATTGACC AAGGAAAGCA TAGTGTAGCG ATTTTTCTTG AGATTTTCAC 12180 

TGCTGGCAAT CAGTAATTGA ACGCCTTCTT CGGTCGCTTG GTTTAGGTTT TGCCTTTCTT 12240 

CATCTGTCAG AAGACTGGTG TTTTGAGAAA TTTTTGGTAA ATTGTCCTGA GTTAGTCCAA 12300 

TAGCATAGAC AAAGTCAGCA GTCAATGGTG CAATCAAATC GTAACPCTGG ACCAGAACAG 12360 

TGTCCACTGT TGCTGGAATG GTACGGTATT GGGACAAACT CATTCCAGAA TGGAGCAAGG 12420 

CTAGGAAGTC TTCCAGACTA ACCTGTGAAC CAGCAAAAAC AGTCGCAAAT TGTTCTAAAA 12480 

CATGGCAGAA AGCCTTCCAA ACTTCGGCTT GTCTTTCCTG TTCTACAGCT TCCAAAGTGG 12540 

TTGTCAAATC TTGTAACTGC TTGGTCACAG CTCCTTCTTT TAGAAAGACA CTCCATTTTT 12600 

GTAGGAGTTT TTCAGCCTTT TGTTTTCGGC TGGCAAAGAG GGTTTCAAGA GGTGCTAAAA 12660 

TTCTCAGGCG GAGGACATTC AAACGCTCAA GATTAAATTT TCCATGGTGG GATTTGGTGA 12720 

AGGTTTGCTG AAAGGCTGGC AAGCCATTGA TACCAAGATA GCGGATATAT TGCTCAAAAG 12780
PATP A AT ATT*


AGACTGACTG AGGTCAGTAT 


ACAAATCAGT 


TCTAAGAAGA 


TTAATCAAAT 


12840 




AAAACGGTAA CGTTTTAAAG 


CTAAAATAGA 


CTCGACAAAC 


TGAGTCAAGG 


12900 




CATGGCTTCG CTTCTACCAA 


GATAAAAAGG 


AATCTGATAC 


TGGTCAAAAA 


12960 




AGATAACTGG TAAGAAGCTA CATCCCCCAA GAGAATACGA AAATGCTTGT 


13020 




TGAGTTCTCA TGTAATTTCP 


GACGAATACT 


ACGGGCTACT 


AGCTCCAACT 


13080 




CGTPAAAPAA fiArVAflATwpT 


GTAAATTTTC 


ACGGTCTTTC 


TCATCGACAT 


13140 




TTPTY^ZV AA If TV"* itr APR n 


ACTCCAACAA 


ACGAGAGGCC 


TTGTCAAAAC 


13200 


IATCCA.TCTT 


1 ^/ilw/Mj 1 1 iAo/iuAACAIj 1 


CCTGAGCAGG 


CGTTTGGTAT 


TTAGAAGCCA 


13260 


GATGATGGAG 


™vtl J. 1 iAtu IviUC 1 i\iv>T 


AGAGATTGCC 


CTCGCTAAAA 


GGACTGGTAT 


13320 


AGGCTTTCTT 




CAATCTCAAC 


ACCTTTGCCG 


TGAAGTAAGT


13380 


CCACAACCCG CTCTTCCTCA GCAGAAAAAC GAGTAAAGCC GTCAATGACC AAGGCGATTT 13440

GATTAAAATC ACTACTTACC TTGTCATTCT CAATAGCCTC AATCAAATGG GACAACTGAC 13500

TTTCCTGGGC TAACTGACCT TGATTAAGAT AGGCTGTTAC TTTCTCAAAA ATCAAGAGTA 13560

AATCCGCCCT CTTATCCTCA TCTGTTAAAT TCTCCAAGTC CAAAAAACTC ATCTGAGATT 13620

TGGTCATCTC ATGGTAAAGC TCAATTAACT GCTGGATCAA TTGAGGATCC TGCTTAATAG 13680

CGCCATAAAC ACGCAAGTCC TTGGGATCGA GTTCGGCAAG GCATTTGTAA AAGGCCAACC 13740

CAAGACCGAT ATCATCAAGA GTAGTTTTAG CTGGTAAATC ATTCAAGACC AGATAGCGAG 13800

CCATTTGAGC AAAGCGCGTG ACGGTAATCG AAAAAGAAGC CTGCTGGGAC AAGTATTCCA 13860

GCACGGCGCG TTCCTTTTCA AAAGAAAGAG AGTTGGGGGC AATGTAGAAG ACCCGCTTGC 13920

CAGCTGCAAC TAGCTCTTCT GCCTCTCTTG TTAGAATTTC TGTCAAAGAA GTCCGAATAT 13980

CAGTATAAAG TAATTTCATC TCAGCCTCGT TGGAATTTTT CATCACCCTA TATTATACCA 14040

TGATTAGCCT CGTAAATCTG TTAAAATATT TAGGCCATCC TTTCTTTTCT TCATCATCTG 14100

CTAAATCTTA AATACTTAGC TTTACTTGTA TTAGATAGAA TAAGTCTGGC TACTGAAAAT 14160

CACATAATAA AAAAGCCTCG GTAACAAGGC TTTGAGTTTT ATGATTGTTT GTTAGGTACG 14220

GAATACACTT CAATGTGTTG TCCCAGTATC TTAATGTCGA CTGGTAGATT GTCTGATTTA 14280

TCGCCATCAA CATCGGACTC TAATTCGATA TCAGAAGAAG TTTTAATATT ACGTGCCTTT 14340

ATATATTCAA TATTCTTGAT AGAATGATTG AACTATAGTA AATTGAAACT ATAATAGTAC 14400

ACCGTGGATG CTAAAATATT TCTAGAAATT AATTTGATTT CCCTAATCAA GCTATTCGTA 14460

TCTTATTTCA ATCTACTATA ATAAAATGAA CCAAAAATAG TACACAATGT GGTATAATCT 14520

TCTTATGGCA TATTCAATAG ATTTTCGTAA AAAAGTTCTC TCTTATTGTG AGCGAACAGG 14580

TAGTATAACA GAAGCATCAC ACGTTTTCCA AATCTCACGT AATACCATTT ATGGCTGGTT 14640

AAAGCTAAAA GAGAAAACAG GAGAGCTAAA CCACCAAGTA AAAGGAACAA AACCAAGAAA 14700 

AGTTGATAGA GATAGACTTA AAAACTATCT TACTGACAAT CCAGATGCTT ATTTGACTGA 14760 

AATAGCTTCT GACTTTGGCT GTCATCCAAC TACCATCCAC TATGCGCTCA AAGCTATGGG 14820 

CTACACTCGA AAAAAAGAAC CACACCTACT ATGAACAAGA CCCAGAAAAA GTAGCCTTAT 14880 

TTCTTAAGAA TTTTAATAGT TTAAAGCACC TAGCACCTGT TTAGATTGAC GAAACAGGAT 14940 

TCGATACTTA TTTTTATCGA GAATATGGTC GCTCATTAAA AGGTCAGTTA ATAAGAGGCA 15000 

AAGTATCTGG AAGAAGATAT CAGAGGATTT CTTTGGTTGC AGGTCTAACA AATGGTGAAT 15060 

TAATCGCTCC AATGACTTAC GAAGAGACGA TGACGAGCGA CTTTTTTGAA GGTTGGTTTC 15120 

AGAAGTTTCT CTTACCAACA TTAACCACAC CATCGGTTAT TATAGTAAAA TGAAATAAGA 15180 

ATAGGGGGGG GGGGGGAGGG GGGGGGAGGG AGA 15213 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6004 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 



TTATTACCTG AAACATTAAA TTTAATTGGA CATCCCGTTA TCAATTTTAT AATATCATCA 60

AGATTTTTAT TATCTGATTC AGGAATTTTA TCTGATATAA CAACACCATT TTCAAGATAG 120

TTCATTAAAT TATTTGATTC ACTAACATTA GTGTTTTGAT CTCCATCAAG CCAAAAATAA 180

TGGTTATCGG AATCTAAATA CGATGAGTTT AAAATATTAT TACAAATTAT TTGATTTGCT 240

CCACCAGGAA TATATCTCAC TACTAAATTC TGTTTAAGAT TCTCACTACC TGAATGAGTG 300

ATAACAAACT CTAGAATATA TTTAGCTAGT CTATCTTCAA CATAAATCAT CTTCCTAGAA 360

TGATACACAT CACCTAATTC AAAAAATGCA TCCTGATAAT CAATATTTTC AATAACATCT 420

ACCTTTTCTC CGTTTTTCAC TAAAAGTTTC ACGGCTTCTC TAGGAAAATC TTTTATAAGT 480

TGTGTAGAAT GTGTAGTGAT AATAATTTGA TGTTTTTTAT TTAAACACTC TTGAAGTAAA 540

AACTCTTTAA ATTTATAGAT TGCACTCGGA TGAAGTGAGA TTTCAGGTTC ATCTATTAAT 600

ATTAATGAAT TTGATTGCGC ATTTACTATA TCATTTACTA ACAAAATAAT TCTAGCCTCA 660

CCTGTTCCTG CAAAAGCCTC GGAATATTCT TTTCCAGATT TTTTCATCCA AATAGTTTTG 720

GAAGCTTTTA TATCATCACC TTTTGAATAC AACTTATGTG TTAAAATTTG AATGTCTGTA 780

TAAGATTCAT CCATTATTTC ACTAATAATT TCACAAACTT TATCATCAAC TTTAACATTA 840 

TCTATAACCA TTTCCTTTTT ATAACGCGTA TAGCTACTTG TATTATTCTT TAAAATATCA 900 

GCAACTGGCT TAGATCGTAA TCTTATAAAA TCTTGTTTAC TACGTTGAGT AGAAATTTTT 960 

TTAAAATTAT AGTGATAGAA AAATAAATCA AAAGCAGAAA CATATTCTTT ACAATCACAA 1020 

AAGACAACAT TTTTTTCAAT GCCATCCCAT CTGTCTGTCG AAGAACTTCC AATATATTTA 1080 

TTTTTGGGTA ATCTTTCCAT CTCATATTGT TTTTGAGGAG CATATGGTTC CCAATAATCT 1140 

AATCCTTTTT TTGTTCCAGA ACGGCCTTTA AGAACTTCTA CATTTCTAGA AGCTTTAATG 1200 

TTATAATATG AATAGATTAA ACATTGTTTC CCATCCACTT CATCTATTTG ATCAACATTT 1260 

GTACTAAACC AATATTCAGA CACACTTTTA TTGGCTGGAG AACCATATAA AGCTTGTAAA 1320 

ATTGAAGTTT TATTTACTCC ATATCTATTA CAGACACCTC AGGATTATTT AACTTATAAG 1380 

TTTTAACAGC TACGGAATCA ATTTCAACAG CAACTTGAAC ATCTATGCCT GATTTTTTAA 1440 

GGCCACTTGT AGTGCCACCT GCACCGTTAA ATAAATCAAT AGCAACAATT TTCCCCATAG 1500 

TATTCTCCTA AAGTTTCTCC TTTTTATTAT AACATTATCA AATGTAAAAC CCAACCCGAT 1560 

AGGGTTAGGT TTTTAACATC ATTTCACCAA CTTCTTCATC TCATCAATAC GTGCGACGGT 1620 

CGCGTCATAT TTAGCTTGGT AGTCAGCTTG TTTGTCGCAT TCTTTTTGGA CGACTTCTGG 1680 

TTTGGCGTTG GCTACGAAGC GTTCGTTAGA GAGTTTCTTA CCAACCATGT CCAGTTCTTT 1740 

TTGCCATTTA GCAAGTTCCT TGTCGAGACG GGCCAGTTCT TCTTCAACAT TGAGGAGATC 1800 

GGCCAGTGGC AGGTAGATTT CTGCTCCTGT GATGACACTT GACATAGCCA GTTCAGGTGC 1860 

AGGGATGGTT GATGCGATTT CCAAGTGTTC TGGATTTGTA AAGCGTTTGA TATAGTTGAC 1920 

ATTGCTGTTA AAGAAGGCTT CCAAGTCGCT ATCGCTTGTC TTAACAAGGA TGGTGATAGG 1980 

CTTGCTTGGT GCTACATTTA CTTCCGCACG CGCATTCCGA ACAGCACGAA TCAAGTCTTT 2040 

GAGACTTTCC ACACCAGTGT GAGCCGCAAG GTCTTCAAAG GCTAGATTAA CAGTTGGGTA 2100 

TGCAGCTGTC ACGATAGAAC CTTCTGAGAT TTGTCCAAAG ATTTCCTCTG TCACGAATGG 2160 

CATGATTGGG TGAAGGAGAC GAAGGATCTT GTCCAGCGTA TAGAGGAGAA CAGATCGAGT 2220 

AATGACCTTA TCGTCTTCAT TGTCGCTGTA TAGAACTTCC TTGGTCAACT CAACATACCA 2280 

GTTGGCAAAT TCTTCCCAGA TGAAGTTGTA AAGGATATGA CCAGCCACAC CAAACTCGAA 2340 

CTTATCAAAG TTTTCAGTAA CTTTTGGAAT GGTTTCGTTG AGATTGTGGA GAATCCAGCG 2400 

GTCCGTCACA TTACCAGCCT CACCTGTTGC AACTTTTGTG ACATTGTCAT GCGCCACATC 2460 

CAGCGTCAAA CCTTCATTGT TCATGAGGAT ATAGCGAGAA ATGTTCCAAA TTTTGTTAAT 2520

AAAGTTCCAT GAAGCATCCA TTTTCTCGTA AGAGAAACGA ACGTCTTGAC CTGGTGCGGA 2580

ACCGTTTGAA AGGAACCAAC GAAGGGCATC AGCACCGTAT TTCTCGATGA CATCCATTGG 2640

GTCAATCCCG TTACCGAGAG ATTTAGACAT CTTGCGTCCT TGCTCGTCAC GGATGAGACC 2700

GTGGATAAGC ACGTTTTGGA ATGGCTGACG ACCAGTAAAT TCCAAGGACT GGAAGATCAT 2760

ACGAGACACC CAGAAGAAGA TGATGTCGTA ACCTGTTACC AAGGTTGAAG TTGGGAAATA 2820

ACGTTTAAAG TCTTCTGAGT CGACTTCAGG CCAGCCCATG GTTGAAAATG GCCAGAGGGC 2880

AGAACTGAAC CAAGTATCCA AGACGTCTTC GTCCTGAGTC CATCCGTCAC CTTCTGGAGC 2940

TTCTTCGCCG ACATACATTT CACCATCAGC ATTGTACCAG GCAGGGATTT GGTGACCCCA 3000

CCAAAGCTGA CGAGAGATAA CCCAGTCGTG GACATTTTCC ATCCATTGAA GGAAGGTATC 3060

GTTGAAACGA GGTGGGTAGA ATTCGACCTT GTCCTCTGTG TCTTGGTTAG CAATGGCGTT 3120

CTTAGCCAAT TGGTCCATCT TGACGAACCA TTGAGTAGAC AAGCGTGGCT CAACTACGAC 3180

ACCTGTACGT TCTGAGTGAC CAACACTGTG GACACGTTTT TCGATTTTGA CAAGGGCACC 3240

GATTTCTTCC AACTTAGCAA CGACTGCCTT ACGAGCTTCA AAACGATCCA TGCCTGAAAA 3300

TTCAAAGGCA AGCTCATTGA TAGTTCCGTC GTCGTTCATG ACGTTGACTT GTGGGAAGTT 3360

ATGACGTTGG CCAACCAAGA AGTCATTTGG ATCGTGGGCA GGTGTGATTT TCACGACACC 3420

AGTACCAAGC TCAGGATCTG CGTGCTCATC TCCAACGATT GGGATGAGTT TATTAGCGAT 3480

TGGAAGGATG ACGTTTTTAC CAATCAAGTC CTTGTAGCGC GGGTCTTCTG GATTAACCGC 3540

AACCGCAACG TCCCCAAACA TAGTCTCAGG ACGAGTTGTA GCAACTTCAA GGGCGCGTGA 3600

ACCATCTTCC AGCATGTAAT TCATGTGGTA GAAGGCACCT TCTACATCCT TGTGAATCAC 3660

CTCAATATCA GAAAGGGCTG TGCGAGCTGC TGGGTCCCAG TTGATGATAA ACTCACGACG 3720

ATAGATCCAG CCTTTCTTGT AAAGGTTCAC AAAGACCTTA CGAACAGCTT TTGACAAACC 3780

TTCATCAAGA GTGAAACGCT CACGAGAATA GTCTACAGAA AGCCCCATCT TGCGGGATTG 3840

TTCCTTGATG GTAGTGGCAT ATTCGTCTTT GCATTCCGAG ACCTTCGTCA AGAAAGACTC 3900

ACGACCTAGG TCATAACGCG TAATACCCTC ACCACGTAAG CGCTCCTCAA CCTTAGCCTG 3960

AGTCGCAATA CCAGCGTGGT CCATACCTGG AAGCCAAAGG GTATCAAAGC CTTGCATGCG 4020

TTTTTGACGG ATGATGATAT CCTGCAAAGT CGTATCCCAA GCGTGACCAA GGTGAAGTTT 4080

CCCAGTTACG TTTGGTGGTG GAATCACGAT TGAATAAGGC TTAGCCTTTT GATCGCCTGA 4140

AGGCTTGAAA ACATCCGCAT CAAGCCATTT TTGGTAACGA CCAGCCTCAA CCTGGGCTGG 4200

ATTGTATTTA GGTGAAAGTT CTTTAGACAT GTGTGTGTCC TTTCTCTATT TTGTTTATTT 4260

TATTTTGAAT TTGCTTAGCA GCTTCTTCTG CAGACAAATT CGTATTATTT ATTTTAAAGT 4320

AGTGGTGCAA CTCATTCGGT TGATGTTGGG AATTTAATTG AAGTGTTTCA GCGGTCTCTA 4380 

AAATTTCTCT TTCAGATACC TCAATATGTC GTTTTAAGGG TTTGTGCTTT AATCGATTCT 4440 

CCGTTCGATT TCGACGTATG CACTCTTCAA GACTTGTTTC CAATTCAACA AACAGAATCT 4500 

CTTGATGAAA GTTATCCAAT AAATCCTGAA TTTGCTTTAA ATACATCAGC TGGTACTGAT 4560 

TTGAAAAATC AATTACGTCT GTTAAAATTA CTGATCGCTG ATTTCTTGCA CTTGCTCCAA 4620 

GGAAAGAAAA GGTAATTCCA CGAACAAATT CCCACATCTC CTCGGTATAA TCCTGATAGA 4680 

TCTCTAGTGC AAAATCAATG GCTTGATGGT TATAAAATAG GGTAGCATCC GTCAGTCGAG 4740 

ATAATTCTTG ACCAATGGTC ATTTTTCCTG ATGCTGGAGC ACCAATGATG AAAAGATGCA 4800 

TCAAATCACC TCCCACTCAC TCCTCAGCAA GCCATATCTC AAATCATCAC AGCAGTTGCC 4860 

TTGAGCATCT TTGCGGTCTC TTATGCGAGC TTCGAGGGTA AAGCCAAGCT TTTCCGAGAC 4920 

TCGTTGACTT TGAAGGTTAT ATCCAAAGCA AGTTAGTTCA ATCTTGTGAA GACCAAGTTC 4980 

TTTAAAAGCT AGATCAATCA AGGAACACGC TGCTTCTGGA ACATAACCTC GACCCCAATA 5040 

GTCTGGGTGC AAGGTATAGC CAAGCTCTAG CACATCATCC GCATGAAGAT GGTTGAAGTC 5100 

AACAGAACCA ATGACTTTAT CGGTTCCTTT GACGACAATC CCATAGCCAG CTGGGAGATT 5160 

TTCCTTTTGA GTACGCTCCG GAAGAATGTG CTCCAGATAA TAAATCTCAT CTTCCAAGAT 5220 

CTTGACTGGA GGAAAACCTG CTGGATAGGC GACCTCTGGC AAACTAGCGT AGGTATGGAT 5280 

ATCCTCAGCA TCCACCACTG TGCGGACTCG TAAAACGAGA CGTTCTGTTT CGATTTTATC 5340 

TGGCAGCTCA GTTCTTGCCA TCCTTCTTCC TCGCTTTTTT GATGAAACTG CCCTTCATAT 5400 

CTACACGCTT GTCCAGATAG CGATAAACGC GCTGATATCC ATCTCCCATG AAATAGGTTG 5460 

GGGCAAACAG TTGATTTTTA AAATGTCCCT TTTCATCCAG GAGTTCTGGG GCAACAAGTC 5520 

GCTCAAGAAT CTTGGCAAAG ATGTGGCAAA TACCGTCTTC CTCAACAATC CTATCTACCC 5580 

GACAATCTAA AACAAGTGGA CAGGCGTCTA AAATAGGAGT CTGAGTTCGT TCAGAAATrT 5640 

CATAATGCAC TCCCAAACGT TCCAATTTCT CCTGATGACT GATAAAACCA GCCTGCTCCA 5700 

TCGCAAGCAT AGAAGTTTCA TCAGAAATAT TCACAGTAAA TTTTTGATAC TGTTTGATCT 5760 

GCTCTGCGGC ATTCTCTCTC GCAACGACTC CAATCACAAC CCAATCTCCT AGACTATAAG 5820 

AGGAACTACA GGTCGTGATG TTATAGCCAA AATTCTAATC TTGATATCCT AAAATAAAAA 5880 

CAGGAAAACC ATAATATAGT TTACTTGTGT TAAAAGATTG CTTCATAACA ACCCCCTTTG 5940 

ACTAAGACGT AAAAGAAAAG CCCTGCCATC TACATGACAG GGACGAATGT GTTTATCCGC 6000 

000(8 6004
(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5857 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 



TGTAGAATTC ACGACAATGC TTCGTTGATT TCTGGGTTGA TTTCGTCGCG TTCTGGCAAG 60

CGAGTCAATG AACCAAAAAT AGTACACAAT GTGGTATAAT CCTTTTATGG CATATTCAAT 120

AGATTTTCGT AAAAAAGTTC TCTCTTATTG TGAGCGAACA GGTAGTATAA CAGAAGCATC 180

ACACGTTTTC CAAATCTCAC GTAATACCAT TTATGGCTGG TTAAAGCTAA AAGAGAAAAC 240

AGGAGAGCTA AACCACCAAG TAAAAGGAAC AAAACCAAGA AAAGTTGATA GAGATAGACT 300

TAAAAACTAT CTTACTGACA ATCCAGATGC TTATTTGACT GAAATAGCTT CTGACTTTGG 360

CTGTCATCCA ACTACCATCC ACTATGCGCT CAAAGCTATG GGCTACACTC GAAAAAAGAA 420

CCACACCTAC TATGAACAAG ACCCAGAAAA AGTAGCCTTA TTTCTTAAGA ATTTTAATAG 480

TTTAAAGCAC CTAACACCTG TTTAGATTGA CGAAACAGGA TTCGATACTT ATTTTTATCG 540

AGAATATGGT CGCTCATTAA AAGGTCAGTT AATAAGAGGC AAAGTATCTG GAAGAAGATA 600

TCAGAGGATT TCTTTGGTTG CAGGTCTAAC AAATGGTGAG TTAATCGCTC CAATGACTTA 660

CGAAGAGACG ATGACGAGCG ACTTTTTTGA AGCTTGGTTT CAGAAGTTTC TCTTACCAAC 720

ATTAACCACA CCATCGGTTA TTATTATGGA TAATGCAAGA TTCCATAGAA TGGGGAAGCT 780

AGAACTCTTG TGTGAAGAGT TTGGGTATAA ACTTTTACCT CTTCCTCCCT ACTCACCTGA 840

GTACAATCCT ATTGAGAAAA CATGGGCTCA TATCAAAAAG CACCTCAAAA AGGTATTACC 900

AAGTTGCAAT ACCTTTTATG AGGCTTTTTT GTCTTGTTCT TGTTTCAATT GACTATATAA 960

ATTGTCTAAG CGAAACAACC GATAAGAATT GGCACAAAAG CGACCGTATT TTTGTTACCA 1020

ATACAGGAAA AACAGTTCAT AGTTCTATCT TGAGCAAGTC TCTCCAGCGA GCAAACGAAC 1080

GCCTTAAAAA ACCAATTCCC AAACATCTGT CCCCTCACAT CTTCAGACAC ACCACTATTA 1140

GCATCTTATC AGAAAATAAA ATTCCTTTAA AAACAATCAC GGACAGGGTT GGTCATCCCG 1200

ACTCTGAAGT CACTACTTCC ATCTACACCC ACGTCACAAA GAACATGAAA GATGAAGCAA 1260

TCAATGTACT GGATAAAGTT ATGAAAAAGA TTTTTTAAAA AGTTTTGTCC CTTTTTTGCC 1320

CTCTAAATAC AAAAATAGCC CTTCGGATAA AATCCGAGGG GCTAGAAACG TTGTTAAATC 1380

AACGGCCGAA CTTTTGAATT TCATGGTTCG GGATAAAATA GTTCACTGAA CTATTTTATT 1440


TTTTAAGGTT ATCATAATAT CAAATAGTTC AATTAAATAC GCTAAATTAC TAATATACTT 1500

TTTACCTTTT TCATTCTAAA ATGTAAAGTA CAAACAATTA CAATATACTA GAGGGGGAG*T 1560

AAAAAAGGTA TTAAATCGAT GAGTTCAGCA GGCAAGAAAA TAGCACCTTT ACGGGTGCTA 1620

TTTTTTAATT AACGCCACGT TAACTTTTGA TTGATGAATT TTATTGTTTG GCACTTCTTT 1680

CATTTCACGG TAAACATCGA TGAAATTCTT TCCAACATTA TTTTTGGAGT TAACTGCATT 1740

TATTTTTGTA TTAATAACTT TTTTAGTATC GAAAGAATGG TTTAAGAAAT CCATAACTAA 1800

CTCTCCTTTC TCATCCTGTA ATCAAGATTT TTATCAATGT CAAAATAGTA TTTTCTATCA 1860

ATCCAAATTG GTCCTTCTCC TTTAGAAATA GCAAGTACAT CTACCGGACC TCCTACTGTT 1920

TCAAGAGTGT TGACAATTTT TCTCTTAAAT GAAGTTAATT CAATAAATGT TTTAGCTGTA 1980

CTCGCCATTT CATTAAGTGG TTGCATTCCA ATAAGGTCTA TTATAGGATT TATATAATAT 2040

TTTTGCTGTA TAGATGATAT ATTTTCAAAT ATATTCTCAA TTTCATCACC CAATCCATTT 2100

TTCTCCATAA CTGATGATAC TTGCTCTGCG ATATATACAT TTAAGTTAGG ATCTATACCA 2160

TTCATAATCG TCTCAACCAT CTCTGACTGT GCAAAAGGGA TTATATGACA AGTTTTATGA 2220

TGATTTATCA CACTTTCATT AATAACTTTC CAAATTAATC GTTTAGAAAA AATTCCATAT 2280

AATTCAATTT GTCTTATAGA TGGAAATATC TCGTCTGTAC CATAACCTGC TATAACTAAT 2340

CCAGTTATGT TTGTTGAGTC ATATCCAATG AAAATCGCTT TATATAAAGA TTTAGCAATA 2400

ACTTCAACCT CATCATCAGT ATGAGGAAAG GATTTAAAAA CATCGTCTAC AATGCTTTTT 2460

ATTAACTCTA ACTCAGCTTC AAAAAATTCA AAATTACTTT CAGCTTCTAC TTTTGAAATT 2520

TCTAAACTAA AATTAGTTAT AGCATTTAAT AAAATTTTAT TAAAATCATC TAGAGTGATG 2580

GTTTCACCAT TAGAAACTCT TAAATCAGCT GTTTCTTGCG CTTCATAGGC AATGCTGTCC 2640

AAAATACTTC TTGTACTTCT GACAATATAA TTTCTTAATA AATCCTCAAC TTGTAGATGT 2700

TTAAAGGAAA TTAAAAATTC TATTAGCTTT TCAACGTATT GGGCAGTATT ATCTAATAAA 2760

TCTGTGCCAA TAGCCTGCTT AAACTCATTT AAAATTACCT CCCACGGAAT TTCCATAAAC 2820

GAAGCGTTCC CATATATCAT GATCCCCACG GAATGTTCTT TTGATAAAGT GAATAATTTT 2880

CGGGCGCTAT TAAAAACTTT TGAATTTTTC CCGTCTGATA AGGTTACAGC GCTATCAGAA 2940

GCCAATACAA CACCATTTTT ATTTAATATT CCAATTTCTG CTGTCAAAAT ATCACCTAAA 3000

CTTTCTAAAC CTGCTCATGC TCTAATGGTA CAACAGCTAA GGTCTTACCA AGACTTGCCA 3060

ACACTTTTAA TACTGTATCA AGTTGTGGGC TTGTCTTTCC TGTTTCCATT CTAGCGATAA 3120

CTGGCTGACT AACACCGCTC ATCTCCTCTA GTTTCTTCTG ACTAATACCC TTTTCATTTC 3180

TAGCCTCGAT AAGCTCACTC ATGATAGCCA CGCGCATATC ACTTTCCAAA ATTTCCTCTT 3240


TGCTGAATAA TTCAGCTCTT ACATCTTTCC AGTTACTACC AATAGCATTA TTTTTCATTG 3300

TCTAAACCTC TTTCTTTTAA ATCTGCAAGT TCACGTTTAG CTTGCTCAAT CTCTCTTTTG 3360

GGTGTTTTCT GTGTCCTTTT CATAAAATGA TGCAGTAAAA CAAAACTACC ATCCATCCAA 3420

GCAACAAATA AAATTCTATC TCTAAGTGGT CTCAGCTCCC AAATTTCAGC ATCTAAATGC 3480

TTAATATATG GTTCGCCTGC GCGTGTTCCA TGTTGGCTTA ACAACTCAAT ATAATCATTA 3540

ATTTTATTAA GCTTAATTCT GCTATCTTTC CCTTTTTTAC TGGTAAGCTC TCGCATATAA 3600

TCAAAAACAG GCTCATTGCC GTTTTTATCC TTGTAAAAAT AGATATTATG CACTATTAAC 3660

ACCTCTTCCT AATAACAATT ATAACCTAAA AGTTATTGTT TGTAAATACT TTTAAGTTAT 3720

TAAAATAAAA AGCACCTAGT TTCCTAGATG CTAGCACAAT GACACGGATT CGCACCGTGG 3780

CTACCTCTAT CAAGGTGTAC TCCTTCTATA CTATCCCTTG TGCTTTAGAA TATTATACCA 3840

CACAATCAAC TAGATACCTA CCATCTCATG ATATACCCGC ATTTTGGGCA AGGGTACAAC 3900

GCTAAAATAC AAATCAGAAT AGATATTAAA CCACTTATTT AACTTATCAT AAGCTGGTGA 3960

TTGACTGATA AATAATATCC GCTGACAAGC TCCGATAACA TTCATGTGAT TGTACACATA 4020

AACCTCTTTT ACAGCCTCTA AAATGTCAGC CTCACTTGTT TGTACCCTAA TATCTGTTAT 4080

CTGCTTGATA GTTGCGTATT TTTGATAAGC TAGCATATCT TGATTTTTAG CAGCATCAAA 4140

CATTTTACGC TCAAGGACAC TATACTTAGG TTGTTCTTTA TCTCGCATGA AATACCACTT 4200

GAGCCATAAA ATCTTTTCTC GGTGTATTAC AGAAATACGC TCAATTTTCT TCTTTGTGAT 4260

TGCTACCTCC TAAATCATCA ATTTAACAAT TCTAACCACT CACTTTTAGA AATAGTTGCA 4320

TAGATCTTGT TCGATGTATG ATACAAAGGT TCTAAATCTT TTTCGACCCT AATATAGTTC 4380

ATCTTATCCT CATGAGTAGG AAAGTATAGT ATTTCCGTTT CATCCTCGTT TAGGATACGA 4440

TTGCACCAAT CATCAATAAT AACTGGCACT TCCCACTCAC GCCATTTTTT AAGGTTTTCT 4500

AAAAGTTCAT TATCACTAAA TAGCTCGCCA TCTATTTGGA AAAATTCCCC TAAGTCATTG 4560

TTTCCTTCAA CAATAATAAA CTCTGGCATA TTTCTATTAC TTAATAACTC CTTGAGTTCT 4620

TGTAACTCTT TGATTTCCTT TAGATACTTC CTCAATTTCC AACCTCAATT CTTCAATCTG 4680

CCTTACTACT CCAAAAATTT CATGGGTCTT ATAAGATTGT TCAAGTATAG CCTTTGCTGC 4740

TTGAGTTCTT ATAAACGGGT TGACCTTACT GTCCATGATA ATATGATTGA GTACAGAAAC 4800

AGCGTTAGAT GATGCTAAAT AAAGCATTTG AGTTGTTTTA TCCATCATCT CATCTTGCTT 4860

TATCCTCAAT GTCTTTTTAA GCGCTGGAAC TTTTAGATAC TTATGACCTG TTGCGCGTGA 4920

TACCCCTGCT TTTTGACATG CTTTGTCTAT CGTTGGCTCG GTAAGCATGG CATCTATGAA 4980

TTTAATTTGC TTGGACGTAA GGTTATCATT TTCATTTCCT GCCATCTATT ACCTCCTCAT 5040 

TATCAAAATA AAGGGTTGCC CCTTTATTTC CCTATGCTAG ATAATTCTGC AATTCTGCAT 5100 

CCATTGCCTC TGAATTGCCC TCAACAATCA TTTCATGCTG TACTAAATCA ATCTTATCTC 5160 

CGTTAATAAG TAAACCACCG TGGAAATAAT CAATTTTTCT ATCAAGGAAA TGTACTAGCT 5220 

TTTCAAGGCG TTGCTGTTGG CTGAATTGCT CCATGTCAAT TTCGATATAA GCAAGGGTAG 5280 

TATCATTATC CATAATATCT TCTAATTTTC TAAGAGCTAG AGGTTTATTT TTATATTTTT 5340 

CTAGGTATTC TCTCATTTCT GCCACTGTTA ATTTGATACT AGATAATAAA CTTAGTTCAG 5400 

CTGCATCATC TGCTGTAATA GGCTCTTCTT TTGATTCATG GTTTGCTAGT TCAGCATTTT 5460 

TCTCTTTTTC TAGTTGCTGA TACAATAGCT GAGCAGTATT TTGGGAATAG TTTTCGCCCT 5520

CTTTTTTATA TTTTAAAAGT TCTTGCTCTG CATACACTTT CCCGATAATG ACTTCCTTAT 5580 

AAACTAATTG CCCATCTTGA GCTTTTAGCT TAATACTCCC ATGCTCTGGA ATTTCAATAT 5640 

ACTTAATTAT ACCATTTTTT GAGTATAAAA CAAAGCCTTT CTCCATCATT TTTAATAATT 5700 

TATCATCCTT GTTTTCAGTC ATGCTTTTCT CCTTTATTTC ATTTTATTAT AATCTGAATA 5760 

CCCCTAGTCT ATTTATTTCA CTAGGTTTTT AGGGTTCGTA TGCTAAAATA CTACCCTTTT 5820 

TGTGTACCTT ATGGCTGACT TTTCAAATTG GTTAGTT 5857

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10254 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

AAAATGATAG CAGGAGAGTT TTCCCGTCCA TCAGACCCAG AACTGAGAGC CTTAGCTCAG 60 

GCTTCTCGCC AAAAACAGGC CGCCTTTAAC AAGGAAGAGA ACCCCTTGAA GGGAGCCGAA 120 

ATCATCAAGA CTTGGTTTGC CTCAACCGGG AAAAATCTTT ACATCAACAC TCGCTTGATG 180 

GTGGACTACG GTGTCAACAT CCATCTAGGG GAAAATTTTT ATTCTAATTG GAACTTGACC 240 

ATGCTGGATA TCTGTCCCAT TCGTATCGGG GACAATGCTA TGATTGGTCC TAATTGTCAG 300 

TTTTTGACAC CCCTCCATCC ACTAGATCCA CAGGAACGCA ATTCAGGTAT CGAGTACGGA 360 

AAGCCTATCA CAATCGGAGA TAATTTCTGG ACTGGTGGTG GCGTCATTGT CCTTCCTGGA 420 

GTGACACTGG GAAATAATGT CGTTGCAGGA GCAGGGGCAG TAATTACCAA ATCTTTTGGC 480 
GACAACGTTG TCCTAGCTGG CAATCCTGCG CGCGTGATTA AGGAAATACC TGTTAAATAG 540 

AAGTAAAAAG GAACAGCTGG GGTTGTTTCT TTTTTGTAGG TTTCATCATT TTTTACCCAG 600 

TTCACATTTA CCTACTCTAT CTCTTAGCAA GTCTGTTTCA TTAAGCAAGT TCAAAGCATC 660 

TCGTAAGTGG GATGTTTTTC TCCTCAGTTC ATCAGCTTCC TCCTTGACAC TCGGTCAGAT 720 

TTTGATACAA TAGTACAAAA TTAGAGGAGG CAGGCTATGA TTCAGAAACA TGCGATTCCT 780 

ATTTTAGAGT TTGATGACAA TCCTCAGGCG GTTATCATGC CCAATCACGA GGGGCTGGAC 840 

TTGCAGTTGC CAAAGAAGTG TGTTTATGCA TTTTTAGGTG AGGAGATTGA CCGCTATGCG 900 

AGGGAAGTAG GGGCGAACTG TGTTGGCGAA TTTGTTTCTG CCACCAAGAC CTATCCAGTT 960 

TATGTCGTGA ACTACAAGGA CGAGGAGGTC TGTCTGGCTC AGGCTCCTGT TGGCTCCGCT 1020 

CCAGCAGCCC AGTTTATGGA TTGGTTGATT GGCTATGGTG TGGAGCAGAT TATCTCTACT 1080 

GGGACCTGTG GTGTCCTAGC TGATATAGAG GAAAATGCCT TTCTAGTCCC TGTTCGCGCT 1140 

CTGCGAGATG AAGGAGCCAG TTACCACTAT GTGGCACCTT GTCGTTATAT GGAAATGCAG 1200 

CCAGAGGCTA TTGCTGCTAT TGAGGAAGTT TTGGAAGACA GAGGGATTCC TTATGAAGAA 1260 

GTCATGACCT GGACGACAGA CGGTTTTTAC CGAGAAACGG CTGAAAAGGT GGCTTATCGT 1320 

AAGGAAGAAG GCTGTGCTGT TGTGGAGATG GAGTGTTCTG CTCTTGCGGC AGTAGCTCAA 1380 

TTGCGTGGGG TTCTCTGGGG TGAATTGTTG TTCACAGCAG ATTCTCTAGC GGACTTGGAG 1440 

CAGTACGACA GTCGTGACTG GGGCTCGGAA GCTTTTAATA AGGCGCTAGA ACTGAGTTTA 1500 

GCAAGTGTTC ACCACCTTTA GTTGTACTGG CAAAGGATTT GTTTTATCAT AAAATGTCTA 1560 

GCTCATACTT TTCAAAAATA TGTTTAAACG AGGTCACCTT CCTCTTGTCC TAGGCATGTT 1620 

GAGGTTGGGA AAAATCTTTA AAATCAGAAA AACGTATCAT ATCAGGTGAT GAAAACTTTG 1680 

ACACTATGCG TTTTATGTCG ATAAGATTTA GAGTGAGATG AAATGATACT CTTCGAAAAT 1740 

CTCTTCAAAC CAGGTCAGCT TCAGCTTGCC GTAGGTATAT GTTACTGACT TCGTCAGTCT 1800 

TATCCGGCAA CCTCAAAACG GTGTTTTGAG CTGACTTCGT CAGTTCTATT TGCAACCTCA 1860 

AAACAGTGTT TTGAGCAACC TGTGACTAGC TTTCTAATCG ATGCCTTGGT TTTCATTGCC 1920 

TATAATCAAA AAGAGAAATT TTCTCCTGAA AAGCATATAG AGTAGCTGGC GTTAAAAGCT 1980 

CCTGTCTTGC TTTTTTGACC TATAGTCACA TCTATCAAGT ATTGTTCTTG CCTAAGCTAT 2040 

CAATAAAAAG GTGGCATTTT TTAGGCTTGG TGTTAGTAGA TTTTGCCTTA TCCTATCTAA 2100 

GTCATTTCGA ACTTTTTATG GTACAATGGA AACATGTTAT TCAAATTATC TAAGGAAAAA 2160 

ATAGAGCTAG GCTTATCTCG TTTATCGCCA GCCCGTCGTA TTTTTTTGAG TTTTGCCTTG 2220 

GTCATTTTAC TAGGCTCTCT TCTTTTGAGC TTGCCCTTTG TCCAAGTTGA AAGCTCACGA 2280 

GCGACTTATT TTGATCATCT TTTCACTGCT GTCTCTGCAG TCTGTGTGAC GGGTCTCTCA 2340 

ACCCTTCCAG TAGCTCACAC CTATAATATC TGGGGTCAAA TAATCTGTTT GCTCTTGATT 2400 

CAGATCGGTG GTCTAGGGCT CATGACCTTT ATTGGGGTTT TCTATATCCA GAGCAAGCAA 2460 

AAGCTTAGTC TTCGTAGCCG TGCAACTATT CAGGATAGTT TTAGTTATGG AGAAACTCGA 2520 

TCTTTGAGAA AGTTTGTCTA TTCTATTTTT CTCACGACCT TTTTGGTTGA GAGCTTGGGA 2580 

GCTATTTTGC TTAGTTTTCG CCTTATTCCT CAACTTGGCT GGGGACGTGG TCTTTTTAGT 2640 

TCCATTTTTC TAGCGATCTC AGCCTTCTGT AATGCCGGTT TTGATAATTT AGGGAGCACC 2700 

AGTTTATTTG CTTTTCAGAC CGATTTACTG GTCAATCTGG TGATTGCAGG CTTGATTATT 2760 

ACAGGCGGCC TTGGTTTTAT GGTCTGGTTT GATTTGGCTG GTCATGTAGG AAGAAAGAAA 2820 

AAAGGACGTC TGCACTTTCA TACGAAGCTT GTACTATTAT TGACTATAGG TTTGTTGTTA 2880 

TTTGGAACAG CAACTACTCT CTTTCTTGAG TGGAACAATG CTGGAACGAT TGGCAATCTC 2940 

CCTGTTGCCG ATAAGGTTTT AGTTAGCTTT TTTCAAACAG TGACGATGCG AACAGCTGGC 3000 

TTTTCTACGA TAGATTATAC TCAGGCTCAT CCTGTGACTC TTTTGATTTA TATCTTACAG 3060 

ATGTTTCTAG GTGGGGCACC TGGAGGAACA GCTGGGGGAC TCAAGATTAC GACATTTTTT 3120 

GTCCTCTTGG TCTTTGCACG AAGTGAGCTT CTAGGCTTGC CTCATGCCAA TGTTGCGAGA 3180 

CGAACGATCG CGCCGCGAAC GGTTCAAAAA TCCTTTAGTG TCTTTATTAT CTTTTTGATG 3240 

AGCTTCTTGA TAGGATTGAT TCTGCTAGGG ATAACAGCCA AAGGCAATCC TCCCTTTATC 3300 

CACCTCGTAT TTGAAACCAT TTCAGCTCTT AGTACAGTTG GTGTAACGGC AAATCTGACT 3360 

CCTGACCTTG GGAAATTGGC TCTCAGTGTT ATCATGCCAC TTATGTTTAT GGGACGAATT 3420 

GGTCCCTTGA CCTTGTTTGT TAGCTTGGCA GATTACCATC CAGAAAAGAA AGATATGATT 3480 

CACTATATGA AAGCAGATAT TAGTATTGGT TAAGAAAGGA AAGAGCATGT CAGATCGTAC 3540 

GATTGGAATT TTGGGCTTGG GAATTTTTGG GAGCAGTGTC CTAGCTGCCC TAGCCAAGCA 3600 

GGATATGAAT ATTATCGCTA TTGATGACCA CGCAGAGCGC ATCAATCAGT TTGAGCCAGT 3660 

TTTGGCGCGT GGAGTGATTG GTGACATCAC AGATGAAGAA TTATTGAGAT CAGCAGGGAT 3720 

TGATACCTGC GATACCGTTG TAGTCGCGAC AGGTGAAAAT CTGGAGTCGA GTGTGCTTGC 3780 

GGTTATGCAC TGTAAGAGTT TGGGGGTACC GACTGTTATT GCTAAGGTCA AAAGTCAGAC 3840 

CGCTAAGAAA GTGCTAGAAA AGATTGGAGC TGACTCGGTT ATCTCGCCAG AGTATGAAAT 3900 

GGGGCAGTCT CTAGCACAGA CCATTCTTTT CCATAATAGT GTTGATGTCT TTCAGTTGGA 3960 

TAAAAATGTG TCTATCGTGG AGATGAAAAT TCCTCAGTCT TGGGCAGGTC AAAGTCTGAG 4020 
TAAATTAGAC CTCCGTGGCA AATACAATCT GAATATTTTG GGTTTCCGAG AGCAGGAAAA 4080 

TTCCCCATTG GATGTTGAAT TTGGACCAGA TGACCTCTTG AAAGCAGATA CCTATATTTT 4140 

GGCAGTCATC AACAACCAGT ATTTGGATAC CCTAGTAGCA TTGAATTCGT AAAGAGGGAT 4200 

GACCCCTCTT TTTTGATGCC TAAGATGGCA AATAGAGACA GAAGCCCCTT GTCTTCTAGT 4260 

AAAAGTTCTT CAAAGGCTGG ACTTTATGGT AAAATAGAAA GAAGTGACAA GAGAGAGTAA 4320 

TACTCAATGA AAATCAAAGA TCAAACTAGG AAACTAGCTA CGGGCTGCTC AAAACACTGT 4380 

TTTGAGGTTG CAGATAGAAC TGACGAAGTC AGTAACATCT ATACGGCAAG GCGACGTTGA 4440 

CGCGGTTTGA AGAGATTTTC GAAGAGTATA AGAAAAAATC AGTCCCCTAA AGGAGTAGAT 4500 

TATGAAGTTA TTGTCTATCG CAATTTCTAG CTATAATGCA GCAGCCTATC TTCATTACTG 4560 

TGTGGAGTCG CTAGTGATTG GTGGTGAGCA AGTTGGGATT TTGATTATCA ATGACGGGTC 4620 

TCAGGATCAG ACTCAGGAAA TCGCTGAGTG TTTAGCTAGC AAGTATCCTA ATATCGTTAG 4680 

AGCCATCTAT CAGGAAAATA AATGCCATGG CGGTGCGGTC AATCGTGGCT TGGTAGAGGC 4740 

TTCTGGGCGC TATTTTAAAG TAGTTGACAG TGATGACTGG GTGGATCCTC GTGCCTACTT 4800 

GAAAATTCTT GAAACCTTGC AGGAACTTGA GAGCAAAGGT CAAGAGGTGG ATGTCTTTGT 4860 

GACCAATTTT GTCTATGAAA AGGAAGGGCA GTCTCGTAAG AAGAGTATGA GTTACGATTC 4920 

AGTCTTGCCT GTTCGGCAGA TTTTTGGCTG GGACCAGGTC GGAAATTTCT CCAAAGGCCA 4980 

GTATACCATG ATGCACTCGC TGATTTATCG GACAGATTTG TTGCGTGCTA GCCAGTTCTA 5040 

ACTGCCTGAA CATAGTTTTT ATGTCGATAA TCTCTTTGTC TTTACGCCCG TTCAGCAGGT 5100 

CAAGACCATG TACTATCTGC CTGTCGATTT CTATCGTTAT TTGATTGGGC GTGAGGACCA 5160 

GTCTGTCAAT GAGCAAGTGA TGATTAAGTG CATTGACCAG CAACTCAAGG TCAATCGACT 5220 

CTTGATAGAC CAACTTGATT TGTCCCAAGT GAGTCATCCC AAAATGCGAG AATATCTGCT 5280 

GAATCATATT GAACTCACGA CGGTGATTTC CAGTACCCTG CTCAACCGAT CTGGAACAGC 5340 

GGAGCATCTG GCAAAAAAAC GCCAATTGTG GACCTATATT CAGCAGAAAA ATCCAGAAGT 5400 

CTTTCAGGCT ATTCGTAAGA CCATGTTGAG CCGTTTGACC AAACATTCTG TCTTGCCAGA 5460 

TCGCAAACTG TCCAATGTCG TCTATCAAAT CACCAAATCT GTTTATGGAT TTAATTAATA 5520 

TAAGTGTTTT ATAAGAGGGA TTTAAGAAAA ATTTTAACTT TTTCTTAGTC CTTTTTAATT 5580 

TCAGGAGATT ATACTAGAGT CATCAAATAA AGAAAGACTC TAAGGAGAAT CCTATGAAAT 5640 

TCAATCCAAA TCAAAGATAT ACTCGTTGGT CTATTCGGGG TCTCAGTGTC GGTGTTGCGT 5700 

CAGTTGTTGT GGCTAGTGGC TTCTTTGTCC TAGTTGGTCA GCCAAGTTCT GTACGTGCCG 5760 
ATGGGCTCAA TCCAACCCCA GGTCAAGTCT TACCTGAAGA GACATCGGGA ACGAAAGAGG 5820 

GTGACTTATC AGAAAAACCA GGAGACACCG TTCTCACTCA AGCGAAACCT GAGGGCGTTA 5880 

CTGGAAATAC GAATTCACTT CCGACACCTA CAGAAAGAAC TGAAGTGAGC GAGGAAACAA 5940 

GCCCTTCTAG TCTGGATACA CTTTTTGAAA AAGATGAAGA AGCTCAAAAA AATCCAGAGC 6000 

TAACAGATGT CTTAAAAGAA ACTGTAGATA CAGCTGATGT GGATGGGACA CAAGCAAGTC 6060 

CAGCAGAAAC TACTCCTGAA CAAGTAAAAG GTGGAGTGAA AGAAAATACA AAAGACAGCA 6120 

TCGATGTTCC TGCTGCTTAT CTTGAAAAAG CTGAAGGGAA AGGTCCTTTC ACTGCCGGTG 6180 

TAAACCAAGT AATTCCTTAT GAACTATTCG CTGGTGATGG TATGTTAACT CGTCTATTAC 6240 

TAAAAGCTTC GGATAATGCT CCTTGGTCTG ACAATGGTAC TGCTAAAAAT CCTGCTTTAC 6300 

CTCCTCTTGA AGGATTAACA AAAGGGAAAT ACTTCTATGA AGTAGACTTA AATGGCAATA 6360 

CTGTTGGTAA ACAAGGTCAA GCTTTAATTG ATCAACTTCG CGCTAATGGT ACTCAAACTT 6420 

ATAAAGCTAC TGTTAAAGTT TACGGAAATA AAGACGGTAA AGCTGACTTG ACTAATCTAG 6480 

TTGCTACTAA AAATGTAGAC ATCAACATCA ATGGATTAGT TGCTAAAGAA ACAGTTCAAA 6540 

AAGCCGTTGC AGACAACGTT AAAGACAGTA TCGATGTTCC AGCAGCCTAC CTAGAAAAAG 6600 

CCAAGGGTGA AGGTCCATTC ACAGCAGGTG TCAACCATGT GATTCCATAC GAACTCTTCG 6660 

CAGGTGATGG CATGTTGACT CGTCTCTTGC TCAAGGCATC TGACAAGGCA CCATGGTCAG 6720 

ATAACGGCGA CGCTAAAAAC CCAGCCCTAT CTCCACTAGG CGAAAACGTG AAGACCAAAG 6780 

GTCAATACTT CTATCAAGTA GCCTTGGACG GAAATGTAGC TGGCAAAGAA AAACAAGCGC 6840 

TCATTGACCA GTTCCGAGCA AAyGGTACTC AAACTTACAG CGCTACAGTC AATGTCTATG 6900 

GTAACAAAGA CGGTAAACCA GACTTGGACA ACATCGTAGC AACTAAAAAA GTCACTATTA 6960 

ACATAAACGG TTTAATTTCT AAAGAAACAG TTCAAAAAGC CGTTGCAGAC AACGTTAAAG 7020 

ACAGTATCGA TGTTCCAGCA GCCTACCTAG AAAAAGCCAA GGGTGAAGGT CCATTCACAG 7080 

CAGGTGTCAA CCATGTGATT CCATACGAAC TCTTCGCAGG TGATGGTATG TTGACTCGTC 7140 

TCTTGCTCAA GGCATCTGAC AAGGCACCAT GGTCAGATAA CGGTGACGCT AAAAACCCAG 7200 

CCCTATCTCC ACTAGGTGAA AACGTGAAGA CCAAAGGTCA ATACTTCTAT CAATTAGCCT 7260 

TGGACGGAAA TGTAGCTGGC AAAGAAAAAC AAGCGCTCAT TGACCAGTTC CGAGCAAACG 7320 

GTACTCAAAC TTACAGCGCT ACAGTCAATG TCTATGGTAA CAAAGACGGT AAACCAGACT 7380 

TGGACAACAT CGTAGCAACT AAAAAAGTCA CTATTAACAT AAACGGTTTA ATTTCTAAAG 7440 

AAACAGTTCA AAAAGCCGTT GCAGACAACG TTAAGGACAG TATCGATGTT CCAGCAGCCT 7500 

ACCTAGAAAA GGCCAAGGGT GAAGGTCCAT TCACAGCAGG TGTCAACCAT GTGATTCCAT 7560 
ACGAACTCTT CGCAGGTGAT GGCATGTTGA CTCGTCTCTT GCTCAAGGCA TCTGACAAGG 7620 

CACCATGGTC AGATAACGGC GACGCTAAAA ACCCAGCTCT ATCTCCACTA GGTGAAAACG 7680 

TGAAGACCAA AGGTCAATAC TTCTATCAAG TAGCCTTGGA CGGAAATGTA GCTGGCAAAG 7740 

AAAAACAAGC GCTCATTGAC CAGTTCCGAG CAAACGGTAC TCAAACTTAC AGCGCTACAG 7800 

TCAATGTCTA TGGTAACAAA GACGGTAAAC CAGACTTGGA CAACATCGTA GCAACTAAAA 7860 

AAGTCACTAT TAAGATAAAT GTTAAAGAAA CATCAGACAC AGCAAATGGT TCATTATCAC 7920 

CTTCTAACTC TGGTTCTGGC GTGACTCCGA TGAATCACAA TCATGCTACA GGTACTACAG 7980 

ATAGCATGCC TGCTGACACC ATGACAAGTT CTACCAACAC GATGGGAGGT GAAAACATGG 8040 

CTGCTTCTGC TAACAAGATG TCTGATACGA TGATGTCAGA GGATAAAGCT ATGCTACCAA 8100 

ATACTGGTGA GACTCAAACA TCAATGGCAA GTATTGGTTT CCTTGGGCTT GCGCTTGCAG 8160 

GTTTACTCGG TGGTCTAGGT TTGAAAAACA AAAAAGAAGA AAACTAATCA GCTAAGGAAA 8220 

TAAATGATGG ATAGTGGGCT GACTAAGATT AGTTTAACAA CTCAATCAGC AATCAGGACT 8280 

TTCTTTCAAT AGCAGATTAA AATCATCGTA AAACAATAAA AATAGTGTTA TACTTAAAGC 8340 

AGTATAGCAC TGTTTTTATC AAAGGAGAGA CAGATGGGAA AGACAATTTT ACTCGTTGAG 8400 

GACGAGGTAG AAATCACAGA TATTCATCAG AGATACTTAA TTCAGGCAGG TTATCAGGTC 8460 

TTGGTAGCCC ATGATGGACT GGAAGCGCTA GAGCTGTTCA AGAAAAAACC GATTGATTTG 8520 

ATTATCACAG ATGTCATGAT GCCTCGGATG GATGGTTATG ATTTAATCAG TGAGGTTCAA 8580 

TACTTATCAC CAGAGCAGCC TTTCCTATTT ATTACTGCTA AGACCAGTGA ACAGGACAAG 8640 

ATTTACGGCC TGAGCTTGGG AGCAGATGAT TTTATTGCTA AGCCTTTTAG CCCACGTGAG 8700 

CTGGTTTTGC GTGTCCACAA TATTTTGCGC CGCCTTCATC GTGGGGGCGA AACAGAGCTG 8760 

ATTTCCCTTG GCAATCTAAA AATGAATCAT AGTAGTCATG AAGTTCAAAT AGGAGAAGAA 8820 

ATGCTGGATT TAACTGTTAA ATCATTTGAA TTGCTGTGGA TTTTAGCTAG TAATCCAGAG 8880 

CGAGTTTTCT CCAAGACAGA CCTCTATGAA AAGATCTGGA AAGAAGACTA CGTGGATGAG 8940 

ACCAATACCT TGAATGTGCA TATCCATGCT CTTCGACAGG AGCTGGCAAA ATATAGTAGT 9000 

GACCAAACTC CCACTATTAA GACAGTTTGG GGGTTGGGAT ATAAGATAGA GAAACCGAGA 9060 

GGACAAACAT GAAACTAAAA AGTTATATTT TGGTTGGATA TATTATTTCA ACCCTCTTAA 9120 

CCATTTTGGT TGTTTTTTGG GCTGTTCAAA AAATGCTGAT TGCGAAAGGC GAGATTTACT 9180 

TTTTGCTTGG GATGACCATC GTTGCCAGCC TTGTCGGTGC TGGGATTAGT CTGTTTCTCC 9240 

TATTGCCAGT CTTTACGTCG TTGGGCAAAC TCAAGGAGCA TGCCAAGCGG GTAGCGGCGA 9300 
AGGATTTTCC TTCAAATTTG GAGGTTCAAG GTCCTGTAGA ATTTCAGCAA TTAGGGCAAA 9360 

CTTTTAATGA GATGTCCCAT GATTTGCAGG TAAGCTTTGA TTCCTTGGAA GAAAGCGAAC 9420 

GAGAAAAGGG CTTGATGATT GCCCAGTTGT CGCATGATAT TAAGACTCCT ATCACTTCGA 9480 

TCCAAGCGAC GGTAGAAGGG ATTTTGGATG GGATTATCAA GGAGTCGGAG CAAGCTCATT 9540 

ATCTAGCAAC CATTGGACGC CAGACGGAGA GGCTCAATAA ACTGGTTGAG GAGTTGAATT 9600 

TTTTGACCCT AAACACAGCT AGAAATCAGG TGGAAACTAC CAGTAAAGAC AGTATTTTTC 9660 

TGGACAAGCT CTTAATTGAG TGCATGAGTG AATTTCAGTT TTTGATTGAG CAGGAGAGAA 9720 

GAGATGTCCA CTTGCAGGTA ATCCCAGAGT CTGCCCGGAT TGAGGGAGAT TATGCTAAGC 9780 

TTTCTCGTAT CTTGGTGAAT CTGGTCGATA ACGCTTTTAA ATATTCTGCT CCAGGAACCA 9840 

AGCTGGAAGT GGTGGCTAAG CTGGAGAAGG ACCAGCTTTC AATCAGTGTG ACCGATGAAG 9900 

GGCAGGGTAT TGCCCCAGAG GATTTGGAAA ATATTTTCAA ACGCCTTTAT CGTGTCGAAA 9960 

CTTCGCGTAA CATGAAGACA GGTGGTCATG GATTAGGACT TGCGATTGCG CGTGAATTGG 10020 

CCCATCAATT GGGTGGGGAA ATCACAGTCA GCAGCCAGTA CGGTCTAGGA AGTACCTTTA 10080 

CCCTCGTTCT CAACCTCTCT GGTAGTGAAA ATAAAGCCTA AAACCCCTTT ACAAATCCAG 10140 

CTATTCATGG TAGAATAGAT TTTGTGTGAA ATATCAGCAG GAAAGCATGA AGCTCGTCAA 10200 

CAGGTGTCTT ATGACAAGTA ACCTTGGCTG TTTAGGCGAA GGGCATCTGC ACGG 10254 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9769 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

CCGGCGACTA TCGATAACAC TTGACTTGGT AGCCCCACAT TTTGGACAAC GCATCCTTTC 60 

CCTCCTTATC GTTTTCTTTT CATTATACCA TTTTTTAAGC GATTCCCAAA ACAATTCTTC 120 

TTTTTGCTTG ACAAGTTTTT TGTTTTGTTG TATTATTTAA TTAAGACAAC AAGGTAAAAG 180 

AAAGGAGACT AAGATGTCCT GGACATTTGA CAACAAAAAA CCCATCTATT TACAGATTAT 240 

GGAGAAAATC AAGCTTCAGA TTGTTTCCCA TACACTGGAA CCCAATCAAC AACTTCCAAC 300 

CGTGAGGAGC TAGCTAGCGA GGCTGGTGTC AATCCCAATA CCATCCAAAG AGCCTTATCA 360 

GACCTTGAAC GAGAAGGATT TGTCTACAGC AAGCGAACAA CTGGACGATT TGTGACTAAG 420 

GATAAGGAGC TAATCGCCCA GTCACGCAAA CAATTATCAG AAGAAGAATT GGAACACTTC 480 
GTTTCCTCCA TGACCCATTT TGGCTATGAA AAAGAAGAAC TACCAGGCGT AGTCAGTGAT 540 

TATATTAAAG GAGTTTAAGC CTATGTCATT ACTAGTATTT GAAAATGTAT CCAAATCATA 600 

TGGAGCAACA CCAGCCCTTG AAAATGTTTC TCTTGACATT CCAGCTGGAA AAATTGTCGG 660 

CCTTCTTGGG CCAAACGGCT CAGGAAAAAC AACCCTGATT AAACTAATTA ATGGCCTCTT 720 

ACAACCAGAT CAAGGACGTG TCCTCATCAA CGACATGGAC CCAAGCCGAG CAACCAAGGG 780 

CGTTGTAGCT TATTTGCCTG ATACGACCTA TCTCAATGAG CAAATGAAGG TCAAAGAAGG 840 

CCTAACCTAC TTCAAGACCT TCTATAAAGA TTGTCAGATC TTGAACGCGC CCATCATCTA 900 

CTTGCAGACC TGGGCATTGA TGAAAATAGT CGTCTCAAGA AACTATCAAA AGGAAAGAAA 960 

GAAAAGGTTC AACTGATTTT GGTTATGAGC CGTGATGCTC GTCTCTATGT TTTGGACGAA 1020 

CCCATTGGTG GGGTGGATCC AGCAGCCCGT GCTTATATCC TCAATACCAT TATCAACAAC 1080 

TACTCACCAA CTTCTACCGT TTTGATTTCT ACCCACTTGA TTTCTGATAT CGAGCCAATG 1140 

TTGGATGAAA TTGTCTTCCT AAAAGACGGA AAAGTCGTCC GTCAAGGAAA TGTAGATGAT 1200 

ATTCGCTACG AGTCAGGTGA ATCCATTGAC CAACTCTTCC GTCAGaATTT AAGGCCTAAG 1260 

CAAAGGAGAT TATTTATGTT TTGGAATTTA GTTCGCTACG AATTTAAAAA TGTTAACAAG 1320 

TGGTATTTAG CCCTCTACGC AGCCGTGCTA GTCCTTTCTG CCCTCATCGG AATACAGACA 1380 

CAAGGCTTTA AAAATCTACC TTACCAAGAA AGTCAGGCTA CTATGGTACT TTTTCTAGCT 1440 

ACAGTCTTTG GTGGCTTGAT GCTTACACTT GGGATTTCAA CCATTTTCTT GATTATTAAA 1500 

CGCTTCAAAG GTAGTGTCTA CGACCGACAA GGCTATCTGA CTTTGACCTT GCCAGTTTCT 1560 

GAACACCATA TCATCACAGC CAAACTAATC GGTGCCTTTA TCTGGTCATT GATTAGCACG 1620 

GCTGTATTGG CTCTAAGTGC TGTTATTATT CTGGCTTTAA CAGCTCCAGA ATGGATTCCT 1680 

CTTTCTTATG TGATTACATT TGTAGAAACA CATCTCCCTC AGATCTTTCT TACAGGTATA 1740 

TCCTTCCTAC TAAATACTAT TTCAGGAATC CTCTGCATCT ACCTGGCTAT TTCGATTGGA 1800 

CAGCTTTTCA ATGAATACCG TACAGCACTC GCTGTTGCAG TGTAGATTGG TATCCAAATG 1860 

GTCATTGGAT TTATTGAACT TTTCTTCAAT CTTAGTTCTA ATTTCTATGT CAATTCACTG 1920 

GTAGGACTCA ATGACCATTT CTATATGGGA GCAGGTATAG CCATTGTTGA AGAACTCATA 1980 

TTCATAGCTA TCTTTTATCT CGGAACCTAC TACATCTTGA GAAATAAGGT TAATTTGCTT 2040 

TAAATAATTT TTACCTAGAT ATGTAACATA CTCATAGAAC AAAAGAGACC AGGCAAAAAG 2100 

TCTTTAAAAT TAGAAAACGC ATAGTATCAG GTGTTGAATA TGTACTGCcC CCGAAAAGTT 2160 

AGATTTTTTC TGTCTAACTT TTGGGGGCAG TTCATAAGAA CCTTGGTAAT ATGCGTTTTT 2220 

TGTGAGCTGA CTTATTTCCT TTCACTATAT CGCAAAATGA AATAAGAACG GAACGATGGG 2280 

ATTTTGGAAT TCAAATCAAT TTATAAGAAT GTTTTAGAAG TAATATTATC CTATTCCAGA 2340 

TTCAGTTCAC TATACAATTG AGTTTTCAAG CAACCTGTTT ACATAATGTG TACATAATTA 2400 

GGTTCGTGAT TCCACCCTTT TCACCTTTAA AAACCTCGCT TTCGCAAGGC TCTTCTATTT 2460 

ATAAGATAAG GCACGTTTAA AGGTTTTCCA AATCCCTAAA TCATCCGTTT GAAGAACGAG 2520 

ACTAGCATAC ATGCGTCCGA TAAATCCTGT TGCTACCACC GCAAAAATCA CTGTAATAGC 2580 

AAGTGAAATC CATGCTTCTG CTCCCCCCGC ATAGTCATTA ATCGTTCGAA ACGGCATAAA 2640 

GAAGGTCGAA ATAAAGGGAA TATAAGAACC AATCTTCAAG AGGAGATTGT CACCAGCTGC 2700 

ACCTAGAGCT GTCACTCCAA AAAAACCACC CATAATCAAA ATCATCAAAG GCGACAAGGC 2760 

TTTCCCTGAG TCCTCAGGAC GAGAAACCAT AGATCCTAGG AAGGCTGCCA AGACTACGTA 2820 

CATGAAAAGA CTGATCAAAA TAAAGAGCAA GGTATTCAGT GAGATAGCAT CTCCCAAGTG 2880 

ATCCAAAATA CCAGACTGAG CCAAGAATGG CAAATCTTTA AAGAGCAAAA CGGCAGCCAG 2940 

ACCACCTACA ACATAGATCC CAATATGCGT TAAAATCACT AGAAACAGAG CCATCATCCG 3000 

CGCATAGAAA TAGTGACTTG CCCTTATGCT AGAAAAAACG ACTTCCATAA TTTTGGTGCC 3060 

TTTTTCACTG GCAACTTCCT GAGCTGTTAC ACCCGCATAG GTAATCAGAA TCATATAAAG 3120 

AAAGAATCCT AAGGCACCTG CTGCAATTGT TTGAATAAAC TTTTTATTTT CCTTGGCTTC 3180 

ATCAATCTTT TCTGTGAATT GAATTGTCTG CGCTAAGCGT TTTTCCTGCT CTTGAGACAA 3240 

GGAAGCAGTT GAACGATTAA GCTGATTTTG CAGTTCATTG AGTGTACCTG TAACCTCAAA 3300 

TTTAATTCCA TTTTCAAGCG ATGTTTCGCC ATGATAAACT GCCTTTAGAA CACTATCTTC 3360 

TTGATCAATG GTCAAATAAC CTTTTAATTT TTCTTCTTTA ATTGCTTCTT TGGCACTTGC 3420 

TTCGTCTTTA TAGTCGAAGT TAACACCATT TACATTCTTC AGTCCTTCTG CTACAGATGG 3480 

CACTGTTGTC ACTACTGCCA CTTTATTATT TTTAGCCATA GAAGAACCTT GGAGATGCCC 3540 

AATTCCTACA GAGATTCCTA AAAAGAGGAA CGGCGAAATC ACCATAAAGA AGAAACTCCA 3600 

TGACTCGACA TGTCGAAGAT AGGTTTCCTT GATTACAACC CACATATTTC TCATACTTCC 3660 

ACTCCTGATT CTAGTTTAAA GATTTCATCG ATAGTTGGCG CTTGTTGGTC AAATGTTCCG 3720 

ATATATTGAC CTTGAGTCAA GATTGAGAAG AGTTCCCTTC CAGCGCTCTC ATCCTCCAAA 3780 

ATCAATTTCC AACTGCCTTG TTTGGTCAAG CTCACCTGTT TGACATGAGG AAGATTTTCC 3840 

AATTCTTCCT TGCTTCGTTC ACTTGAAACA AAGAGACGCG TTTTCCCGTA TTGATTGCGG 3900 

ACATCCTGAA CTGGTCCGTG CAAGACCACA CGGCCATCTC GGATCATCAG AATATCGTCA 3960 

CAAAGTTCCT CAACATTGGT CATGACATGG TCAGAAAAGA TAATGGTTGT CCGCGCTCTT 4020 
TTTCCTGAAA AATGACTTGT TTGAGCAATT CTGTATTAAC TGGGTCCAAT CCACTAAAAG 4080 

GCTCATCCAA GATAATCAGG TCTGGTTCAT GAATCAGAGT AATAATGAGC TGAATCTTCT 4140 

GCTGATTTCC TTTTGACAGA CTCTTGATTT TATCTGTCAG CTTTCCTTTC ACTTCCAACC 4200 

TCTTCATCCA TTGAGGGAGT TTTTCTTTGA CTTCTTTGGC ATCCATGCCT TTTAGAGTCG 4260 

CCAAGTAGCG AACTTGTTCA AGAACTGTCA ATTTAGGCAT GAGATGCGTT CTTCAGGCAG 4320 

ATAACCAATC CGAGCATAGG TCTCCTGACG AATATCCTGA CCATCCAGAC CGATTTCTCC 4380 

CTGATATTCT AGGAATTTCA AAATACTATG GAAAATCGTT GTTTTTCCAG CACCATTTTT 4440 

TCCGACTAGT CCCAAAATAC GACCTGGTCG CGCTTGAAAG TCAATACCAA ACAAAACTTG 4500 

CTTGGATCCA AAACTTTTCT CTAGACTTCT TACTTCTAGC ATCTTTCACC TCCGAAATTT 4560 

CTTGCACTCA TTATACTCCT TTTTGATAGC CTTTACAATG TTTTTTGTCC ATTTTTAGAA 4620 

GACTATTGCT GTGTAAAATA TGGCCTGGAG CACTTTTATA CTCAATGAAA ATCAAAGAGC 4680 

AAACTAGGAA GCTAGCCGTA GACTGCTCAA AGTACAGCTT TGAGGTTGCA GATAAAACTG 4740 

ACGAAGTCgA CTCAAAACAC TGTTTTGAGG TTGTGGATAG AACTGACGAA kCrTAaCTAT 4800 

ATCTACGGCA AGGCGAAcTG ACGTGGTTTG AAGAGATTTT CGAAGAGTAT TAGTGATAAA 4860 

TCCATTATAC AGCAGCAAAC TTAATTTATA CCTTCCGCTC CTCAACTGTC TATTTTTAAT 4920 

CCTGAATTGT TATTTGAGTA ACTCCTTTTT CCTCGTAAAG TTTTCTTCCT CTAAAACTTC 4980 

TGGAAAAAGG CTAATAGTTT CAGACAACAT TTTTATAAGA AACAAGTTCA TCTGTGATTT 5040 

CAAGAAGGAG TAATCCTTTA TCTACTAATG GACGGAACAG AATTCAACCG CTTGTCCGAT 5100 

ATGTTTTCTA AGGATTATAT AGTAAAATGA AATAAGAACA GGACAAATTG ATCAGGACAG 5160 

TCAAATTGAT TTCTAACAAT GTTTTAGAAG TAGATGTATA CTATTCTAGT TTCAATCTGC 5220 

TATATCTATT ATGCACACCC CTATAGGATC TAATGAAAAT CACAACAGGC TCATTCATAG 5280 

ATGGTTACCT AAGCCTAAGG GAACTAAGAA AACGACTACC AAGGAAGTCG CATTCATCGA 5340 

AAAGTAGATT AACAACTATC CTAAAAAATG CTTGAACTAC AAGTCCCCCA GAGAAGACTT 5400 

CTGGATGACT AACTTGAACT TGAAATTTAG CAATAATTAA TTCACTATCT AACTATATTT 5460 

AGTAATTATT TCAGAACTGA TTAATATTAA AATTAACTAA CAATTCAAAG GATTCATACT 5520 

AGCCATAAAT TACGTCCATC AGAGAGAGAC TCTTACTACT TTTAGATTTT AGTCTTTCTA 5580 

GCTTCAGAAT ACATCTAAAC TTTAGGGAAA ATGACTATTC GAAAGCGCGA ATGCCTCAAA 5640 

ATTATCTCAG ATAAGCTATT CGAAACTTAG AATGCTTTTA AATTTATGGA ATTGCGATTA 5700 

TTCGAAACCT AGAATGCATA TAACCTTTAG TTGACAGACC TATTCTAAGT CTCGAAGGGC 5760 
TATTTACTTT 


CTATTCCTTA 


TCAAAAAAGA 


CTCATTCCCC 


CTTTCTCCTC 


PAAAATATHP 

v^\AAA 1 nluVl 




TATAGTAGAA 


ATATACTATC 


TATGAGGAGT 


TTACATC3TCA 
1 1 n^n 1 \j 1 


PAGflATAA AP 

V, AV3UA 1 AAAL- 


AAA ITV^ TV A a r*r* 
AAA IvjAAAuL 


5880 


TGTTTCTCCC 


CTTCTGCAGC 


GAGTTATCAA 


1 a 1 ^» 1 1^0 


ATTfJTPPJ^TYl 


nryrMTVpr' a r* 

LrLrVjl IVyOVjALr 


5940 


TTTGATTTTC 


TGTATTTGGG 


V* 1 1 Al v~*VVJ VJ V- 


1 vuvin 1111 a 


PAATPPA Af2P 


AAALLL IL. 1 C 


6000 


TGCCTTTATC 


CAGCAC5GCAR 


1 V_ lUUUU 


TPPAPPTPTP 


TTTATCTTTT 


TACAGATTTT 


6060 


ACAGACTGTC 


V» 1 V— V_V_ 1 A 1 V-A 


ttpp A anr 1 cc 

i 1 L.V-AvjLtLjLjL, 


PTTPaPPTPT* 
L- 1 luALL 1 LLt 


Lt ItjvjL, i\)L>(jVj 


TCTTTATCTA 


6120 


CGGGCACATC 


ATPflfifi apt a 
Al L.OLtAjAI_ i A 


TPTAPA APT A 
1L1ALAAL 1 A 


1 A iUwln i\- 


uTGATTrGGCT 


GTGCCATTAT 


6180 


P^'HI^M^ATPT A 




AL-LtLjALjL. 1 vjL. 


CTTTGTCCAG 


TCTGTCGTCA 


GCAAGCGCAC 


6240 




1 A*- A i \_\jrAV_ 1 


L7L7L. 1 ALiA 1 AA 




TTTGACCGCT 


TCTTTATTTT 


6300 




Tf2P. P PP & TT A 


L7V.L>L-ALtL. 1 VjA 


CTTTCTCTGT 


ATGCTGGCTG 


CCCTGACCAA 


6360 




AALtL.L>L. 1ALA 


»Pf TV /"V> JV »TI/*» TVI** 

TGaCCATCAT 


CATTCTGACC 


AAACCCTTTA 


CCCTCGTGGT 


6420 


1 1 A 1 ALL 1 AL 


Wj 1 v, IXiftCCT 


ATATTATTGA 


CTTTTTCTGG 


CAAATGCTTT 


GACACGTAAA 


6480 


AAATCf RTTT 




GTGGATTTTT 


AAAGCGTAGA 


TTAACTATAG 


CTTGATACTA 


6540 


m\ i a i al. ill 


fiTTllW A A A 
LjVj 1 A i XjKj AAA 


1 L, A 1 LtL-A 1 A 1 


TTTTCGATAG 


TGAGGCGAGG 


ACTTACCTAG 


6600 


pptttp pp rr* 

V— L. 1 1 ILLuLL 


Vj l\a A 1 AuAAA 


LALL 1 VjAAA 1 


CTAATGGTTT 


CAGGTATTCG 


GAAACTTTGA 


6660 


a a\£ i w i L 


TP A A APTTT A 
1 Lnnnu 1 i 1 A 


LtAj 1 AIajuAA 1 


*T!fiirfV» 7\ 11 7v TV tv 
1 1 IajAAuAAA 


GTCGCTACCG 


TCCGTAATCA 


6720 


v. i i aavjuaaa 


fi/IPTP A A A A A 
uuL 1 LAnAAn 


TATTGTTTTC 


iv tv » r* 71 n Jv iv 
AALLaLaAaA 


TCCGTTTGGT 


TTCCCAAGCG 


6780 


GATT^PTV^TfiP 


llinl 11 iuA 


AACTTCTTTT 


uLAAlsAAL-AA 


AGTTCCCAAG 


TGTGGCAGAA 


6840 


^— ^— 111 1 V7 




Crime A Pf2AT A 


m t\ /^m/~> Tv car' 71 
i Au Iw^L-LrL-A 


LA 1 CTGGTAC 


TGGTAGGTAA 


6900 


CCATTAAGAA 


GAGATftTAAA 

AV3A 1 \J 1 AAA 


TTTPTP A Pf2f2 


A P A r t f > r ,r TTT > A 
ALALvilj i LLA 


f*r* ?l t* it rrvfnw 
uLA X A lo 1 l\j 


TTGAGCCATG 


6960 


ACCCCTCCAC 


CAAAGACAAT 


CACGTPTCRR 


PrtfiAAAfTPPA 


PTfJTP/lPATT 
v.l«i L.L7L.A 1 1 


AACCGGAGCT 


7020 


TGAGCGATAT AGTAGGCTTG AACATCCCAA ACAGGGTTGT TGAGTTCAAT AGTTTCCCCA 7080 

CGTACACCTG TACGAGCTTC CAAACTTGGA CCAGCTGCAT AACCTTCTAG ACATCCCTTA 7140 

TGGAAAGGAC AAACACCCTT AAACTCTTTT TCAATATCCA TTGGGTGTCT AGCAACATAA 7200 

TAATGACCCA TTTCAGGGTG ACCCACACCA CCGATAAACT CACCACGTTG GATGACGCCT 7260 

GCACCGATAC CTGTACCGAT TGTGTAGTAA ACCAAGTTTT CGATACGACC ACCAGCATTG 7320 

TTACGGGCAA CCATTTCACC GTAAGCAGAG CTGTTTACGT CTGTTGTGAA GTACATTGGC 7380 

ACGTTTAGGG CGCGACGAAG GGCACCAAGC AAGTCTACAT TTGCCCAGTT TGGTTTTGGA 7440 

GTCGTCGTGA TAAAGCCATA AGTTTTTGAG TTTTTGTCAA TATCAATCGG CCCAAATGAA 7500 

CCAACTGCAA GACCAGCAAG GTTATCGAAT TTTGAGAAGA ACTCAATGGT TTTATCGATT 7560 






GTTTCGATTG GAGTTGTTGT TGGAAATTGT GTrrm'Ci'A CAACGTTAAA GTTTTCATCA 7620 

CCGACAGCAC AGACAAACTT TGTACCGCCC GCTTCCAAGC TTCCATATAA TTTTGTCATG 7680 

ATAAACCTCT TGTTTTTATT TTCTTTATTA TAGCATACTT CGAAAGTCTA AATGTCTCTA 7740 

TTTTTTAGAT TTTCCTCTGT AAATCTTACT ATCTAATAAA AACGAACAAA CATGTCATTT 7800 

GTTCGTTTTC ACATTAGAGA GGATTGATTA GATTTTCACT TCGATCACAG CATCCCCCTT 7860 

AGCAACTGAA CCTGTTGCGA CTGGAGCTAC TGAAGCGTAG TCACCTGTAT TTGTAACGAT 7920 

AACCATTGTT GTATCATCAA GTCCAGCTGC AGCGATTTTG TTTGAGTCAA ATGTTCCAAG 7980 

AACATCGCCA GCTTTCACCT TATTACCTTG AGCAACTTTT GTTTCAAAAC CGTCACCGTT 8040 

CATAGATACA GTATCAATAC CAACATGAAT CAAAACTTCA GCACCATTTC TTGTTTTCAA 8100 

ACCAAAAGCG TGCCCTGTTG GAAAGGCAAT TGAAACTTCA GCATCAGCTG GTGCATAGAC 8160 

CACGCCTTGG CTTGGTTTCA CAACGATACC TTGTCCCATA GCTCCACTTG AGAAGACTGG 8220 

GTCATTGACA TCAGCAAGAG GGACAACATC ACCGACGATA GGAGTTACAA GTGTTTCATT 8280 

TTGAAGAGCT GCTGGCGCAA CTTCTTGTTT TTCTTCAGCC ACTTCAGCTC GTTTTGCAGC 8340 

TGCAGTTGCG TCTACTTCAT CTTCGTAACG AAACATGTAA GTAAGAGCAA AACCAAGGGC 8400 

AAATGATACA GCTACCATAA GAAGGTATTG TGGAAGTTGT GCGTTACCAA CATAAAGCAT 8460 

TGTACCAGGG ATGATGGTGA TACCATTACC AGTACCAGCA AGTCCAAGGA TAGAAGCCAA 8520 

TCCACCACCG ATTGCACCAG CAATCAATGA AAGGAAGAAT GGTTTACGGA AGCGCAAGTT 8580 

CACCCCGAAG ATAGCAGGCT CTGTAATACC TAGGAAGGCA GAAAGAGCAG CCGGGAAAGG 8640 

AAGTGTTTTC AGTTTTGGAT TTTTTGTTTT AACACCAACC GCAACAGTAG CAGCACCTTG 8700 

AGCTGTCATA GCAGCTGTGA TGATAGCGTT GAATGGGTTA GCATGGTCAG CAGCAAGTAA 8760 

TTGCACTTCA AGCAAGTTGA AGATGTGGTG CACACCTGAC ACGACGATCA ATTGGTGAAC 8820 

GCCACCAATC AAGAAACCAC CAAGACCAAA TGGCATGCTA AGAATGGGTT TTGTAGCAAT 8880 

AAGGATGTAG TTTTCAACAA CGTGGAAAAC TGGTCCAATG AGAAAGAGTC CAAGGATAGA 8940 

CATGACCAAA AGTGTCACGA ATGGTGTTAC CAAGAGGTCA ATGACATCTG GAACAACTTG 9000 

CGGACAGCTT TTTCAAATTT AGCTCCGACA ACCCCGATGA TGAAGGCTGG AAGAACGGAA 9060 

CCTTGCAAAC CAACAACAGG GATGAAACCA AAGAAGTTCA TCGCTGTTAC TTCACCACCT 9120 

TGAGCAACTG CCCAAGCGTT TGGAAGTGAG CCAGAGACAA GCATCATACC AAGAACGATA 9180 

CCAACGGCAG GATTTCCACC AAATACAGGG AAGGTTGAGG ACAGAACCAA ACCTGGCAAG 9240 

ATGATGAAGG CTGTATCTGT CAAGATTTGT GTGTAAGTTG CAAAGTGACG TGGAAGTGGG 9300 

ATTTCAAGAG CGTTGAAAAG ACCACGCACA CCCATGAAGA GACCTGTCGC TACGATAACT 9360 

GGGATGATTG GAACGAAAAC ATCACCAAAA GTACGGATAG CACGTTGGAA CCAGTTCCCT 9420 

TGTTTAGCAA CTTCTGCTTT CATGTCATCC TTAGATGATG TTGGTAATCC AAGTACAACA 9480 

ACTTCATCGT ACATTTTGTT AACTGTACCT GTACCAAAGA TAATTTGGTA TTGCCCTGAG 9540 

TTAAAGAAAG CACCTTGAAC TTTTTCCAAG TTCTCAATCA CTTCTTTATT GATTTTCTCT 9600 

TCATCTTTGA CCATGACACG TAGACGAGTC GCACAGTGGG CAACACTATT GACATTTTCA 9660 

CGTCCGCCCA AGGCATCGAT GACTTTTTTT GCAATTTCCT GATTGTTCAT TTGCAAAAAT 9720 

CTCCTTATAT AACATTTTGT TCTTGTTTGA AAGCGATTTT ATTCGCCGG 9769 



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3149 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 



CGCTTGAGTG CTAATTCATA GTTCTATTGT ATCACTTGGT CAGAAATAAT CAAGAAAAAA 60 

GTCTGACTTT CTCAAGATAA AAAGCCTGAG ACCAACTCAG ACTTTTTAAT TCTTAAAATG 120 

GCAATTCTTC CTCTTCCAAG ACCAAATCTG CCAAATCTTG GCCTGCATTA TTTTCACGCA 180 

TAGCACGTTG GGCACGACTT TCCAAGAGTT GGAATCCTGT GACAAGTACT TCGGTCACGT 240 

AGTTCATTTG GCCATTTTTC TCAAAGCGAC GGGTACGCAA TTCTCCATCA ACGGAAATGA 300 

GACTACCTTT GGTTGCGTAC TTGCCAAAGT TTCTGCTAGT CTGCCCCATA GGACCATATT 360 

GACAAAATCA GCTTCACGTT CACCGTTTTG GTCTTTGTAA CGACGGTTCA CAGCGATAGT 420 

TGCTCGCGCT ACCGACTTGT CATTGTTGGT TTTGTGCAAT TCTGGTGTAG ACGTTAAACG 480 

TCCAATCAAG ATAACTTTAT TATACATATT TTCTTCCTCC TACTTATCTA TTCGTAGGAA 540 

ATCAAAAAAA GTTACAGAAA TTTGTAACTT TTCGAGAAAA TTTTTTATTT TTTATGAACC 600 

ATGAAACCTG TCGCCTGTTG ATTGGCCATA ATGGTCATAT CTGTAATCTG AACACGACGA 660 

GGTTGACTAG TCACATAGAC TACTGTATCT GCAATATCCT GAGCTTGCAA AGCTTCTATT 720 

CCTTGGTAAA CGGACGCAGC TCGTTCTTTA TCACCATGAA AACGCACTGT AGAAAAATCT 780 

GTTTCGACAA TTCCAGGCTG AATGGTCGTC ACCTTGATAT CCGTTGCGAT GGTATCAATT 840 

CGCAGTCCAT CTGAAAAGGT CTTAACTGCC GCCTTGGTGG CTGAGTAAAC AGCTGCACCA 900 

GCATAGGCAT AAATTCCTGC GGTTGACCCC ATATTGATAA TATGACCTTG ATTGGCTTTT 960 

ACCATTGCTG GCAAGAAACA GCGAGTGACT GCCATCAAAC CTTTGACATT GGTATCCAAC 1020 

ATGGTCAGCA TATCCAACTC TTCATAGTCT TGATAGGGAG CTAAGCCAAG AGCCAGTCCT 1080 

GCGTTATTGA CCAGGATGTC AATCTGACCT ATCGTTTCTA AAATATCAGA GCAGACAGTG 1140 

TTTACCATTG TCATATCCGT GACATCTAGG AGAAAAGTCC AAACTGTTTG ATTTGGAAAA 1200 

GTTTCTGCAA ACTCCGCCTT AAGAGCTTCT AGTGTGTCTA TCCGTCGTCC TGTTAGAAGG 1260 

ACATCCTCAC CCTGCTCCAG ATAAGCACGC GCAATCGCTT CACCGATTCC TGATGTCGCT 1320 

CCTGTAATCA CAACATTTTT TGCCATCTTA TTTCCTTCTA GCTGGTCTAT CAGATATTAA 1380 

CAACTTCTTA GGCAGTCCAG TGTTTCGCTG GGTCGAACGG TGTTCCGAGA ACTTGGTGTT 1440 

CTGATAATTC AAGCACCCCA CGTTTTTGTG GAGCATTTGG CAGATGCAAT TCACGAGGAG 1500 

TGCACATCAT ACCAAAACTC TTTTCACCAC GAAGTTCACC TGGGAAAATG AGATTOCCTT 1560 

TTGGCATCAT AGCTCCAGGA AGCGCGACAA TGGTTTTCAA CCCCACACGC GCATTGGGAG 1620 

CTCCTGCAAC GATTTGTACA GTCTTATCAG TTGCGACTGG AACTTGGCAG ATGTTGAGGT 1680 

GGTCACTATC TGGATGGGCT ACCATCTCAA CAATTTCACC TACAACAAAC TTAGGTTCCT 1740 

TATCATTAAC AATTTCTTCT GTAAAACCTT CCGCCTGCAA CTCTTGGTTC AAACGAGCGA 1800 

CTTGCTCATC TGTCAAAAAG ACTTGACCGC GCTCTGGAAT TTCAAATAAA CTTGAAACTT 1860 

CGAAAATATT CCAAGCCACT GTTTCCCCAT TATCTTTGAG AAAAACACGG GCTACCTTGC 1920 

CTTTGCGCTC CACATCCAGT TTGGCATCTC CGCTATTTTT CACGATGACC ATAAGGACAT 1980 

CACCGACATG TTGTTTATTA TATGTAAAAA TCATTGTTTC CTTTTTCTCC TATTTCAGTC 2040 

CTGCTAAAAA GTCATTGATT TGTTGCTTGC TTTTACGGTC GCGATTGACA AAACGACCGA 2100 

TTTCCTTGTC CTTTTCTAGA ACAACAAGGC TAGGAATTCC GTAAAGATGG CAGAGTTTGG 2160 

CCAAATCCAT ATACTGATCT CGGTCCATTC GAATAAAGGT GAACTCTGGA TTGGTCTCCT 2220 

CAATCTCTGG TAAGGCAGGA TAAATATAAC GACAATCGCT ACACCAGTCT GCCACAAAAA 2280 

TGAAGACCTT CTTGCCCGCT TTTTCCACTA AAGATGCTAA TTCTTCTAAA GTTGCTGGCT 2340 

GTATCATAAG ACTTCCTCCT CATAGACTAG GTCTTCATTT TCATAGACAA AGGTATAATG 2400 

ACGGCCATCC TCAAAAATGA CGCCACCAAC CAAGCTCTCC AGACTGCTTT CGTAAACTTG 2460 

AACATAAAGG GTCGCAATTT CCCCCATGTC GGAAAAATGG TCTCGCACAA TCTCTGTCAA 2520 

CTCTTCCTGA GTCTTCATGA GCTTACGGTC ATCTGCAACT TTTTTCGTAG CAAGAGCAAG 2580 

GCTTCCGATA CCTAGCAGAG CCAAGCCTGC CATCCACATT TTTTTAGCTT TGATACGATT 2640 

CATTTTAACA CAAAAAAGGC TTCAGGACAA ATGAGGAAGC AGCAGAAAAG CAAGTAAAAA 2700 
GCCTCTTCCT TTAAGGAAAA GGACTTCTTA TACTCAATGA AAATCAAAGA CCAAACTAGG 2760 

AAGCTAGCCG CAGGCTGCTC AAAGCACTGC TTTGAGGTTG TAGATAGAAC TGACGAgTCa 2820 

CTCAAAACAC TGTTTTGAGG TTGTGGATGA AGCTGACGTG GTTTGAAGAG ATTTTCGAAG 2880 

AGTATTATTC TTATTGCCAG GCACCTAAGT TGCCAACGTA GTAACTATCA GGTGTGTAGG 2940 

TATTGCGAGC ATCTTACCTG ATGAAGCCAG ATAATACTAC TTGCCATTGT CTTTGACCCA 3000 

ATCATTCGCA ATCATGGAAC CAGAAGAACT TACATAATAC CATTCTCCCT TGTCATAAAC 3060 

CCAAGTACTG ACTTTCATGG TTCCTGAGCA ATTAAAGGCA AAAAAACTGT CCAATAACAT 3120 

TCGTTTTTTA AAAGCATTTG ACACTACAT 3149 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10240 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 



CCAAAAATTC AACCTTTAAG GGGAGTCCAG AGAGACTCAC AAGGTGTCAG ATAAAAGAAT 60

GGTGCAATTT TCTAGAGGAG ACTTTTTGAG TGTGCTCTCT TGTGTTGTAC GATTTTAACT 120

GAGGCCTTGC ACTAGCAAGG TCTTTTCTTT ATCTGGTCCC CTTAAAATTT AAGGAGGAAA 180

AGTTATGAAT CCCACATGTA AGAAGCGTTT GGGTGTCATT CGGTTGGAAA CCATGAAGGT 240

GGTTGCACAA GAGGAAATCG CGCCACAATC TTTGAATTAG TCCTAGAAGG AGAAATGGTT 300

GAAGCCATGC GAGCAGGCCA ATTTCTTCAT CTGCGTGTAC CGGACGATGC CCATCTCTTA 360

CGTCGTCCTA TTTCAATTTC GTCTATTGAC AAGGCAAACA AGCAGTGTCA CCTCATTTAT 420

CGGATTGACG GAGCTGGGAC TGCAATTTTT TCAACCTTAA GTCAGGGAGA CACTCTTGAT 480

GTGATGGGGC CTCAGGGAAA TGGTTTTGAC TTGTCTGACC TTGATGAGCA GAATCAGGTT 540

CTCCTTGTTG GTGGTGGGAT TGGTGTTCCA CCCTTGCTTG AGGTGGCCAA GGAATTGCAT 600

GAACGTGGAG TGAAAGTAGT GACAGTCCTC GGTTTTGCTA ATAAGGATGC TGTTATTTTG 660

AAAACGGAAT TGGCTCAGTA TGGTCAGGTC TTTGTAACGA CAGATGATGG TTCTTATGGC 720

ATCAAGGGAA ATGTTTCCGT TGTTATCAAT GATTTAGACA GTCAGTTTGA TGCTGTTTAC 780

TCGTGTGGGG CTCCAGGAAT GATGAAGTAT ATCAATCAAA CCTTTGATGA TCACCCAAGA 840

GCCTATTTAT CTCTGGAATC TCGTATGGCT TGTGGGATGG GAGCTTGCTA TGCCTGTGTT 900

CTAAAAGTAC CAGAAAACGA GACGGTCAGC CAACGCGTCT GTGAAGATGG TCCTGTTTTC 960
CGCACAGGAA CAGTTGTATT ATAAGGAGAA AATTATGACT ACAAATCGAT TACAAGTTTC 1020

TCTACCTGGT TTGGATTTGA AAAATCCGAT TATTCCAGCA TCAGGCTGTT TTGGCTTTGG 1080

ACAAGAGTAT GCCAAGTACT ATGATTTAGA CCTTTTAGGT TCTATTATGA TCAAGGCGAC 1140

AACCCTTGAA CCACGTTTTG GGAATCCAAC TCCAAGAGTG GCAGAGACGC CTGCTGGTAT 1200

GCTCAATGCA ATTGGCTTGC AAAATCCTGG TTTAGAGGTT GTTTTGGCTG AAAAGCTACG 1260

TTGGCTGGAA AGAGAATATC CAAATCTTCC TATTATTGCC AATGTAGCTG GTTTTTCAAA 1320

ACAAGAGTAT GCAGCTGTTT CTCATGGGAT TTCCAAGGCA ACTAATGTAA AAGCTATCGA 1380

GCTCAATATT TCTTGTCCCA ATGTTGACCA CTGTAATCAT GGACTTTTGA TTGGTCAAGA 1440

TCCAGATTTG GCTTATGATG TGGTGAAAGC AGCTGTGGAA GCGTCAGAAG TGGCAGTTTA 1500

TGTCAAATTA ACCCCGAGTG TGACCGATAT CGTTACTGTC GCAAAAGCTG CAGAAGATGC 1560

GGGAGCAAGT GGCTTGACCA TGATCAATAG TCTGGTTGGA ATGCGCTTTG ACCTCAAAAC 1620

TAGAAAACCA ATCTTGGCCA ATGGAACAGG TGGAATGTCT GGTCCAGCAG TCTTTCCAGT 1680

AGCCCTCAAA CTCATCCGCC AAGTTGCCCA AACAACAGAG CTGCCTATCA TTGGAATGGG 1740

AGGAGTGGAT TCGGCTGAAG CTGCCCTAGA AATGTATGTG GCTGGGGCAT CTGCTATCGG 1800

AGTTGGAACA GCTAACTTTA CCAATCCTTA TGCCTGCCCT GACATCATCG AAAATTTACC 1860

AAAAGTCATG GATAAATACG GTATTAGCAG TCTGGAAGAA CTCCGTCAGG AAGTAAAAGA 1920

GTCTCTGAGG TAAACTGCAA TCAATCTGTT CTTGATTTTT TATTAGTTTG TAATATGAAT 1980

TTAGGAGAAT TTTGGTACAA TAAAATAAAT AAGAACAGAG GAAGAAGGTT AATGAAGAAA 2040

GTAAGATTTA TTTTTTTAGC TCTGCTATTT TTCTTAGCTA GTCCAGAGGG TGCAATGGCT 2100

AGTGATGGTA CTTGGCAAGG AAAACAGTAT CTGAAAGAAG ATGGCAGTCA AGCAGCAAAT 2160

GAGTGGGTTT TTGATACTCA TTATCAATCT TGGTTCTATA TAAAAGCAGA TGCTAACTAT 2220

GCTGAAAATG AATGGCTAAA GCAAGGTGAG GACTATTTTT ACGTGAAATC TGGTGGCTAT 2280

ATGGCCAAAT CAGAATGGGT AGAAGACAAG GGAGCCTTTT ATTATCTTGA CCAAGATGGA 2340

AAGATGAAAA GAAATGCTTG GGTAGGAACT TCCTATGTTG GTGCAACAGG TGCCAAAGTA 2400

ATAGAAGACT GGGTCTATGA TTCTCAATAC GATGCTTGGT TTTATATCAA AGCAGATGGA 2460

CAGCACGCAG AGAAAGAATG GCTCCAAATT AAAGGGAAGG ACTATTATTT CAAATCCGGT 2520

GGTTATCTAC TGACAAGTCA GTGGATTAAT CAAGCTTATG TGAATGCTAG TGGTGCCAAA 2580

GTACAGCAAG GTTGGCTTTT TGACAAACAA TACCAATCTT GGTTTTACAT CAAAGAAAAT 2640

GGAAACTATG CTGATAAAGA ATGGATTTTC GAGAATGGTC ACTATTATTA TCTAAAATCC 2700
GGTGGyTACA TGGCAGCCAA TGAATGGATT TGGGATAAGG AATCTTGGTT TTATCTCAAA 2760

TyTGATGGGA AAATrGCTGA AAAAGAATGG GTCTACGATT CTCATAGTCA AGCTTGGTAC 2820 
TACTTCAAAT CCGGTGGTTA CATGACAGCC AATGAATGGA TTTGGGATAA GGAATCTTGG 2880 
TTTTACCTCA AATCTGATGG GAAAATAGCT GAAAAAGAAT GGGTCTACGA TTCTCATAGT 2940 
CAAGCTTGGT ACTACTTCAA ATCTGGTGGC TACATGGCGA AAAATGAGAC AGTAGATGGT 3000 
TATCAGCTTG GAAGCGATGG TAAATGGCTT GGAGGAAAAA CTACAAATGA AAATGCTGCT 3060 
TACTATCAAG TAGTGCCTGT TACAGCCAAT GTTTATGATT CAGATGGTGA AAAGCTTTCC 3120 
TATATATCGC AAGGTAGTGT CGTATGGCTA GATAAGGATA GAAAAAGTGA TGACAAGCGC 3180 
TTGGCTATTA CTATTTCTGG TTTGTCAGGC TATATGAAAA CAGAAGATTT ACAAGCGCTA 3240 

GATGCTAGTA AGGACTTTAT CCCTTATTAT GAGAGTGATG GCCACCGTTT TTATCACTAT 3300 

GTGGCTCAGA ATGCTAGTAT CCCAGTAGCT TCTCATCTTT CTGATATGGA AGTAGGCAAG 3360

AAATATTATT CGGCAGATGG CCTGCATTTT GATGGTTTTA AGCTTGAGAA TCCCTTCCTT 3420 

TTCAAAGATT TAACAGAGGC TACAAACTAC AGTGCTGAAG AATTGGATAA GGTATTTAGT 3480 

TTGCTAAACA TTAACAATAG CCTTTTGGAG AACAAGGGCG CTACTTTTAA GGAAGCCGAA 3540 

GAACATTACC ATATCAATGC TCTTTATCTC CTTGCCCATA GTGCCCTAGA AAGTAACTGG 3600 

GGAAGAAGTA AAATTGCCAA AGATAAGAAT AATTTCTTTG GCATTACAGC CTATGATACG 3660 

ACCCCTTACC TTTCTGCTAA GACATTTGAT GATGTGGATA AGGGAATTTT AGGTGCAACC 3720 

AAGTGGATTA AGGAAAATTA TATCGATAGG GGAAGAACTT TCCTTGGAAA CAAGGCTTCT 3780 

GGTATGAATG TGGAATATGC TTCAGACCCT TATTGGGGCG AAAAAATTGG TAGTGTGATG 3840 

ATGAAAATCA ATGAGAAGCT AGGTGGCAAA GATTAGTACT ATAAGTGAAT ATGATTTGAG 3900 

TGAATAGTAA GTTAAAAATC CTGATTTCAA GTAAAATCAG GATTTTTTCA TGGATGCAAT 3960 

TTTTTTGGAG TCTGGTGTGA CGCGGAGGGT CTTTTGTCCT GTGTAAGTGA CAAAGCCGGG 4020 

TTTTCCACCA GTTGGTTTAT TGAGTTTTTT GACTTCAATC ATATCTACCT GCACCAGATT 4080 

CGACAGGCGC CCTTGAGAGA AGTAGGCAGC TAACTCTGCT GCGTCTGTCT TGACTGCATG 4140 

AGATGGGTCA AGATTTCCTG AGATGACAAC ATGGCTTCCA GGAATGTCCT TAGCATGGAA 4200 

CCAAAGTTCC TCCTTGCGGG CCATTTTAAA GGTCAATTCC TCATTTTGAA GATTGTTTCG 4260 

TCCGACATAG ATGATGGTTT TGCCATCGCT TGCTAGATAT TGTTCTAGTT TTTTGCGTTT 4320 

CTGGATTTTC TCCCGTTGTC TTCTGCGGAT AAAACCTGTT TGAATCAATT CTTCACGGAT 4380 

TTCAGCGATT TCTTCCAGTC CAGCTTGGTT GAGGACGGTT TCTACACTTT CCAGATAGAG 4440 

AATAGTGGCT TTGGTTTCTT CAATCAAATC AGTCAAGTAT TTGACAGCTT CTTTGAGTTT 4500 






CTGATACCGT TTAAAATAGC GTTGGGCATT CTGGTTGGGA GTCAGAGCCT TATCAAGCGC 4560

AATCATGATA GGTTGGTTGG TATAGTAGTT GTCTAGGATA ACCTGGTCTT GGTCGTTAGG 4620

CACTTGGTGG AGGAAGGTTG TCAGCAATTC TCCTTTTTGA CGAAATTCTT CAGCGTTGTC 4680

TGTCGCCAGT AACTCTTTTT CCTGTTTTTT GAGTTTGTGT CGGTTTTTCT GAAGTTCATT 4740

TTCAACACGA CGAATCAGTT CACTGGCCTG CTGTTTGACG CGGTCGCGCT CAGCCTTATC 4800

CTTATAGTAG GTGTCCAACA AATCAGAAAG ATTTGCAAAA GGCTCTCCCA CCTGATTTGC 4860

AAAAGGAACT GGACTGAAGG AAGTCTCAGT CAAGCATGGC TTGGTTTCTT GATTGAAAAA 4920

ATTTCGGAAA GCGGAAAGTT TTTCACTAAC CAGTATCCTT TCCAATTCAT TTGCCGTATC 4980

GCGTCCCAGA CCTTGAAAGA GGCTTTGAAG ATTTTTTGCT GTTAGTTCTT GGGTTTGCAG 5040

GATTTCAAAG AGCTTTTCAT CCTTGATAGT AAAAGGATTG AGAGATTTTG TACTTGGCGG 5100

AGCGATATAG GTCGATCCTG GAAGTAAGGT GCGGTAGCTA TTTTGTGAAA AGCCGACGTG 5160

TTTGATAACT TCGAGGATTT TATGACTGCT TTTATCGACC AGTAGAATAT TACTGTGTTT 5220

CCCCATAATT TCGATAATCA AGGTAGCCTG GATATGGTCT CCAATCTCGT TTTTATTGGA 5280

AACTGTAATT TCCACAATAC GGTCATTTTC CACTTGCTCA ATCGACTCAA TCAGGGCCCC 5340

CTGCAAATAC TTTCTCAAAA CCATGATAAA GGTAGAAGGT TGAGCTGGAT TTTCAAAAGT 5400

CGTTTGGGTC AGCTGAATGC GTCCAAAAAC TGGATGGGCA GAAAGGAGCA GGCGATGGCT 5460

TTGGCGATTG CTGCGGATTT GCAAGACCAA CTCTTGTTCA AAAGGCTGAT TGATTTTCTG 5520

GATGCGACCA TTCACTAATT CGCTTCGCAA TTCCTCAACT ATGTGGTGTA AAAAAAATCC 5580

GTCAAATGAC ATCGTTCTCT CCTTGTGATT GTATTCCATA GTATTATATC AAAAAGGTAG 5640

AATAAAATCA TGGAAATGTG GTATAATAAA GCCAAGTAAA GAGAAACGAG AAGCACATGT 5700

ATATTGAAAT GGTAGATGAA ACTGGTCAAG TTTCAAAAGA AATGTTGCAA CAAACCCAAG 5760

AAATTTTGGA ATTTGCAGCC CAAAAATTAG GAAAAGAAGA CAAGGAGATG GCAGTCACTT 5820

TTGTGACCAA TGAGCGTAGT CATGAACTTA ATCTGGAGTA CCGTAACACC GACCGTCCGA 5880

CAGATGTCAT CAGCCTTGAG TATAAACCAG AATTGGAAAT TGCCTTTGAC GAAGAGGATT 5940

TGCTTGAAAA TTCAGAATTG GCAGAGATGA TGTCTGAGTT TGATGCCTAT ATTGGGGAAT 6000

TGTTCATCTC TATCGATAAG GCTCATGAGC AGGCCGAAGA ATATGGTCAC AGCTTTGAGC 6060

GTGAGATGGG CTTCTTGGCA GTACACGGCT TTTTACATAT TAACGGCTAT GATCACTACA 6120

CTCCGGAAGA AGAAGCGGAG ATGTTCGGTT TACAAGAAGA AATTTTGACA GCCTATGGAG 6180

TCACAAGACA ATAAACGAAA ATGGAAAAAT CGTGACTTGA TATCCAGTTT AGAATTTGCT 6240
TTGACAGGTA TTTTTACTGC TATCAAGGAA GAACGCAATA TGCGAAAACA CGCAGTGACG 6300

GCTCTAGTGG TCATCCTTGC AGGTTTTGTT TTTCAGGTGT CACGAATCGA ATGGCTCTTT 6360 

CTCCTATTGA GTATTTTCTT GGTAGTAGCC TTTGAGATTA TCAACTCTGC TATTGAAAAT 6420 

GTGGTGGATT TGGCCAGTCA CTATCACTTT TCCATGCTGG CTAAAAATGC CAAGGATATG 6480 

GCGGCCGGCG CGGTATTAGT GGTTTCTCTT TTCGCAGCCT TAACAGGCGC ATTGATTTTT 6540 

CTCCCACGAA TCTGGGATTT ATTATTTTAA ACAGTAAGAG GAAATTATGA CTTTTAAATC 6600 

AGGCTTTGTA GCCATTTTAG GACGTCCCAA TGTTGGGAAG TCAACCTTTT TAAATCACGT 6660 

TATGGGGCAA AAGATTGCCA TCATGAGTGA CAAGGCGCAG ACAACGCGCA ATAAAATCAT 6720 

GGGAATTTAC ACGACTGATA AGGAGCAAAT TGTCTTTATC GACACACCAG GGATTCACAA 6780 

GCCTAAAACA GCTCTCGGAG ATTTCATGGT TGAGTCTGCC TACAGTACCC TTCGCGAAGT 6840 

GGACACTGTT CTTTTCATGG TGCCTGCTGA TGAAGCGCGT GGTAAGGGGG ACGATATGAT 6900 

TATCGAGCGT CTCAAGGCTG CCAAGGTTCC TGTGATTTTG GTGGTGAATA AAATCGATAA 6960 

GGTCCATCCA GACCAGCTCT TGTCTCAGAT TGATGACTTC CGTAATCAAA TGGACTTTAA 7020 

GGAAATTGTT CCAATCTCAG CCCTTCAGGG AAATAACGTG TCTCGTCTAG TGGATATTTT 7080 

GAGTGAAAAT CTGGATGAAG GTTTCCAATA TTTCCCGTCT GATCAAATCA CAGACCATCC 7140 

AGAACGTTTC TTGGTTTCAG AAATGGTTCG CGAGAAAGTC TTGCACCTAA CTCGTGAAGA 7200 

GATTCCGCAT TCTGTAGCAG TAGTTGTTGA CTCTATGAAA CGAGACGAAG AGACAGACAA 7260 

GGTTCACATC CGTGCAACCA TCATGGTCGA GCGCGATAGC CAAAAAGGGA TTATCATCGG 7320 

TAAAGGTGGC GCTATGCTTA AGAAAATCGG TAGCATGGCC CGTCGTGATA TCGAACTCAT 7380 

GCTAGGAGAC AAGGTCTTCC TAGAAACCTG GGTCAAGGTC AAGAAAAACT GGCGCGATAA 7440 

AAAGCTAGAT TTGGCTGACT TTGGCTATAA TGAAAGAGAA TACTAAGTAG AGGTAGGCTC 7500 

ATGCCTGCTT CTTGTTTTTA CAGAAGGAGG ACTTATGCCT GAATTACCTG AGGTTGAAAC 7560 

CGTTTGTCGT GGCTTAGAAA AATTGATTAT AGGAAAGAAG ATTTCGAGTA TAGAAATTCG 7620 

CTACCCCAAG ATGATTAAGA CGGATTTGGA AGAGTTTCAA AGGGAATTGC CTAGTCAGAT 7680 

TATCGAGTCA ATGGGACGTC GTGGAAAATA TTTGCTTTTT TATCTGACAG ACAAGGTCTT 7740 

GATTTCCCAT TTGCGGATGG AGGGCAAGTA TTTTTACTAT CCAGACCAAG GACCTGAACG 7800 

CAAGCATGCC CATGTTTTCT TTCATTTTGA AGATGGTGGC ACGCTTGTTT ATGAGGATGT 7860 

TCGCAAGTTT GGAACCATGG AACTCTTGGT GCCTGACCTT TTAGACGTCT ACTTTATTTC 7920 

TAAAAAATTA GGTCCTGAAC CAAGCGAACA AGACTTTGAT TTACAGGTCT TTCAATCTGC 7980 

CCTTGCCAAG TCCAAAAAGC CTATCAAATC CCATCTCCTA GACCAGACCT TGGTAGCTGG 8040 
ACTTGGCAAT ATCTATGTGG ATGAGGTTCT CTGGCGAGCT CAGGTTCATC CAGCTAGACC 8100

TTCCCAGACT TTGACAGCAG AAGAAGCGAC TGCCATTCAT GACCAGACCA TTGCTGTTTT 8160

GGGCCAGGCT GTTGAAAAAG GTGGCTCCAC CATTCGGACT TATACCAATG CCTTTGGGGA 8220

AGATGGAAGC ATGCAGGACT TTCATCAGGT CTATGATAAG ACTGGTCAAG AATGTGTACG 8280

CTGTGGTACC ATCATTGAGA AAATTCAACT AGGCGGACGT GGAACCCACT TTTGTCCAAA 8340

CTGTCAAAGG AGGGACTGAT GGGAAAAATC ATCGGAATCA CTGGGGGAAT TGCCTCTGGT 8400

AAGTCAACTG TGACAAATTT TCTAAGACAG CAAGGCTTTC AAGTAGTGGA TGCCGACGCA 8460

GTCGTCCACC AACTACAGAA ACCTGGTGGT CGTCTGTTTG AGGCTGTAGT ACAGCACTTT 8520

GGGCAAGAAA TCATTCTTGA AAACGGAGAA CTCAATCGCC CTCTCCTAGC TAGTCTCATC 8580

TTTTCAAATC CTGATGAACG AGAATGGTCT AAGCAAATTC AAGGGGAGAT TATCCGTGAG 8640

GAACTGGCTA CTTTGAGAGA ACAGTTGGCT CAGACAGAAG AGATTTTCTT CATGGATATT 8700

CCCCTACTTT TTGAGCAGGA CTACAGCGAT TGGTTTGCTG AGACTTGGTT GGTCTATGTG 8760

GACCGAGATG CCCAAGTGGA ACGCTTAATG AAAAGGGACC AGTTGTCCAA AGATGAAGCT 8820

GAGTCTCGTC TGGCAGCCCA GTGGCCTTTA GAAAAAAAGA AAGATTTGGC CAGCCAGGTT 8880

CTTGATAATA ATGGCAATCA GAACCAGCTT CTTAATCAAG TGCATATCGT TCTTGAGGGA 8940

GGTAGGCAAG ATGACAGAGA TTAACTGGAA GGATAATCTG CGCATTGCCT GGTTTGGTAA 9000

TTTTCTGACA GGAGCCAGTA TTTCTTTGGT TGTACCTTTT ATGCCCATCT TCGTGGAAAA 9060

TCTAGGTGTA GGGAGTCAGC AAGTCGCTTT TTATGCAGGC TTAGCAATTT CTGTCTCTGC 9120

TATTTCCGCG GCGCTCTTTT CTCCTATTTG GGGTATTCTT GCTGACAAAT ACGGCCGAAA 9180

ACCCATGATG ATTCGGGCAG GTCTTGCTAT GACTATCACT ATGGGAGGCT TGGCCTTTGT 9240

CCCAAATATC TATTGGTTAA TCTTTCTTCG TTTACTAAAC GGTGTATTTG CAGGTTTTGT 9300

TCCTAATGCA ACGGCACTGA TAGCCAGTCA GGTTCCAAAG GAGAAATCAG GCTCTGCCTT 9360

AGGTACTTTG TCTACAGGCG TAGTTGCAGG TACTCTAACT GGTCCCTTTA TTGGTGGCTT 9420

TATCGCAGAA TTATTTGGCA TTCGTACAGT TTTCTTACTG GTTGGTAGTT TTCTATTTTT 9480

AGCTGCTATT TTGACTATTT GCTTTATCAA GGAAGATTTT CAACCAGTAG CCAAGGAAAA 9540

GGCTATTCCA ACAAAGGAAT TATTTACCTC GGTTAAATAT CCCTATCTTT TGCTCAATCT 9600

CTTTTTAACC AGTTTTGTCA TCCAATTTTC AGCTCAATCG ATTGGCCCTA TTTTGGCTCT 9660

TTATGTACGC GACTTAGGGC AGACAGAGAA TCTTCTTTTT GTCTCTGGTT TGATTGTGTG 9720

CAGTATGGGC TTTTCCAGCA TGATGAGTGC AGGAGTCATG GGCAAGCTAG GTGACAAGGT 9780
GGGCAATCAT CGTCTCTTGG TTGTCGCCCA GTTTTATTCA GTCATCATCT ATCTCCTCTG 9840

TGCCAATGCC TCTAGCCCCC TTCAACTAGG ACTCTATCGT TTCCTCTTTG GATTGGGAAC 9900 

CGGTGCCTTG ATTCCCGGGG TTAATGCCCT ACTCAGCAAA ATGACTCCCA AAGCCGGCAT 9960 

TTCGAGGGTC TTTGCCTTCA ATCAGGTATT CTTTTATCTG GGAGGTGTTG TTGGTCCCAT 10020 

GGCAGGTTCT GCAGTAGCAG GTCAATTTGG CTACCATGCT GTCTTTTATG CGACAAGCCT 10080 

TTGTGTTGCC TTTAGTTGTC TCTTTAACCT GATTCAATTT CGAACATTAT TAAAAGTAAA 10140 

GGAAATCTAG TGCGAGTAAA AATCAATCTC AAATGCTCCT CTTGTGGCAG TATCAATTAC 10200 

CTAACCAGTA AAAATTCAAA AACCCATCCA GACAgATTGA 10240 



(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13206 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

CGCTTTATCG TGGACGTGGT CAAGCCGAGA ATTTCATCAA GGAGATGAAG GAGGGATTTT 60 

TTGGCGATAA AACGGATAGT TCAACCTTAA TCAAAAACGA AGTTCGTATG ATGATGAGCT 120 

GTATCGCCTA CAATCTCTAT CTTTTTCTCA AACATCTAGC TGGAGGTGAC TTCCAAACTT 180 

TAACAATCAA ACGCTTCCGC CATCTTTTTC TTCACGTGGT GGGAAAATGT GTTCGAACAG 240 

GACGCAAGCA GCTCCTCAAA TTGTCTAGTC TCTATGCCTA TTCCGAATTG TTTTCAGCAC 300 

TTTATTCTAG GATTAGAAAA GTCAACCTGA ATCTTCCTGT TCCTTATGAA CCACCTAGAA 360

GAAAAGCGTC GTTAATGATG CATTAAAGAA CAGTCGAGAT GAAAAAATCG TGTGACGCAC 420 

CAAGGGAGGA GTCTGCCCTT TTGAGGAAAT CTAGCGAGGA AAAACGATAC TGGAACAGCA 480 

GAAAGTAAAA CTGACCTCAT GAGGAGGAAG AAAGTGGCTC ATGAGGTCAG GGGTTTTG^A 540 

AGTTACATCT AGTTGAGAGA GGTATGAATG ATTTGGGATT AATCATTTCT TGTTTTAAAT 600 

CAGGAGAATA GTAACGATTT TTTCCTTTTT TGACGAACTC TATTCCGTAA CGATCAATCA 660 

ATTTAATCAT GTACCTAATA TTAGAATTGT TTATCCCAAA TTTATTTGAA AGCTTCTCTA 720 

AGCTATATCC TTGTTTTCTA AGTTCATAGA TCTGAACTTT ATCATCATAA GTTAGTTTCA 780 

TAATAAAAAC ACCCCAAAAG TTAGATTTTT TCTGTCTAAC TTTTGGGGGG CAGTTCATTC 840 

AACACCTGAT ACTATGCGTT TTTCTTATTT GAAATACTTT TTACTCAACC TCTTTATACT 900 

CAATGAAAAT CAAAGTGCAA ACTAGAAAGC TAGCCTCAGG CTGCTCAAAA CAGTGTTTTG 960 
AGGTTGCAGA TGGAAGCTGA CGTGGTTTGA AGAGATTTTC GAAGAGTATT ACTTAATCTT 1020

CTTGATACTT TGACTAAGAA TAAATCCTAC AATCATCCCT ACCATATTTT GCATAAAATT 1080 

CGGTAGAATT TCTGGGAGGG CTGCTGCCCA GCCATTCATC AAAGCAGAAC CCAAGGCGTA 1140 

GCCTCCTACC ATGGCAATAG TTGCTAAAAT AAGGCCTAAC CACTGACTTT TTCCTTTAAA 1200

TCCTGCGAAA AATCCCTGCA AGCCATGGTT GACCAAGCTA AAGAACATCC ACTGAGGGTA 1260 

GCCTGATAAG AGGTCAATCA AGAAACTTGC TAGTCCTCCG ACTACCGCTC CTTCACGACT 1320 

ACCAAAGTAA AAGGCCGCAA AGAAGACACC AGCATCTAAA AGAGTTAGAA TTCCTGTAGG 1380 

TGTTGGGATT TTTAAGAAAT AACCTAGAAC CACAGAAAGG GCGGTTAATA GGGATACAAG 1440 

GGCGATTTTA GTTGTTTTTG TTTGCTTCAT ATTGTCTTAC TCCATACTGA TCTGCTTGTG 1500 

CAATAGCACG ATAAACGAAA GCCTTAGAGC TTTCTACTGC TGGCAAAAGT TTATCACCTT 1560 

TAACCAGGTG ACTGGCAATG CTAGAGsCAA AGGTACAACs TGCACCAGCA TTTTGGCCTT 1620 

GGATAACTGG ATTTTCTAGG ATAGTAAAGG TCTGTCCATC ATAAAAGACA TCCACAGCCT 1680 

TGTCCTGACT AAGACGATTG CCTCCCTTGA TAATGACTGt GGCGCTCCTA AATCATGCAA 1740 

TTTCTGCGCT GCAGTTTTCA TGTCTTCCAA GGTTTTAATT TCCTGACCGG ATAATAATTC 1800 

TGCTTCTGGG AGATTAGGCG TAATCACACT GACATAAGGG AAAAAGCGAA TCAACTCTTG 1860 

GCAGAGCTCA CTGACAGCTA CATCATGCGT TTCCTTGCAG ACCAAGACAG GATCCAACAC 1920 

CACAGGTACT CCTGGGCGTT GTTTGATAAA GTCCAAGGCC TTCTCAGCCA CGCTGACAGT 1980 

AGGGAGAAGA CCAATCTTAA TTCCCCCAAA TTCCACATCA CGCAAGCTAT CTAATTGATG 2040 

TTGAAAAATG GTATCATCAG TTGGAAAGAC TTCAAATCCT TTTTCTGTCA AGGCTGTCAA 2100 

ACAAGTCACT GCTACAAACC CATGCAAGCC GTTCAAGGTA TAGGTAGCCA AATCAGCTGA 2160 

CAGTCCACCA CCACTAAAAA TATCATTTCC AGAAAGTGCT AAAATACGAT TATTCTTCAT 2220 

AACGAATCTC CTTTAAATAC AAACCATTTG GTGCTGCAGT GGGACCTGCA AGTTGCCTGT 2280 

CCTTCTTCTC CAAGATGAGA TCAATCTGCT CTACTGGCAT GCGGTTGTTA CCGATTTTGA 2340 

GAAGAGTCCC CACCATATTG CGAATCTGTT TATACAAGAA ACCATTTCCT GAAAAGGTAA 2400 

AGGTCAAAAA TTGTCCTGTC TCATCGACTA TTAAACTAGC TTCTGTGATG GTGCGAACCT 2460 

TATCCTCTAC ACTAGTCCCA GAGGCTGTAA AACCGGTAAA ATCATGGGTT CCCTCTAGCT 2520 

TTTTGATTGC AATCTGCATT CGTTCCACAT CGAGTGGGTA GGGAAAGTGG GTGGCATAGT 2580 

GACGGCGCAT CGGATTTTTG GGACGTCCTC TATCCACAGT AAACTCATAG GTCTTGCTAT 2640 

GCTTGGCATA ACGGCAATGA AAATCATCTG CCACAAGCTC AATCGAAATC ACATCAATAT 2700 
CTTCAGGAGA CTGGGTATCC AAGGCAAAAC GGAGTTTCTC CTCATCCATC TGATAAGGCA 2760

GGTCAAAATG AATCACCTGT CCCAGGGCAT GAACCCCACT ATCTGTCCTA CCAGCACCGT 2820

GAACAGTAAT GGCTTGCCCT TTATTTAATC TGGTCAAGGT TTTTTCAATT TCTTCCTGAA 2880

CGCTACGCGC ATGAGGCTGG CGCTGAAAGC CAGCAAAGGC ATAACCATCA TAGGAAATAG 2940

TTGCTTTATA TCTCGTCATA GCCTCTATTT TATCAAGAAA TTAGTCTGTA AACAAGGACC 3000

TAAAACAAAT ATTGTATGGG TATAAAAATC TCATACTCTT CGAAAATCTC TTCAAACCAC 3060


GTCAGTTTCC ATCTGCAACC TCAACACACT ATTTTGAGCA ACCTGCGGCT owv X X X I* X /\ X 3120

AGTAGATTGA AATAAGATAT GAACAACTCT ATTAGGAAAG TCAAATTAAT TTTTAfiAAAT 3180

ATTTTAGCAG CTACAGCGTA CTATTCCAAA CTCAATCAAC x n x j. x x 1L1 1 JuA Ail 3240

TCATTGAGTA TCAAAAGAAA AACTTAGGAA TCAATCCTAA GC TC**Pf , *I w Pf v P X ^ X 3300

CATGACAAAG ATAGAGATTA CAATCAACCA ACCTCCTAAG ATACTAAAKA V^AAUA I 3360

ATTGTGAGTT AGTAAGCCAA TTGCACCTAG AACGAATGGG GTcnTAAArai x X /UUI\J\9 pTpfv a jv run 3420

ACAGCCTAAT ACAGCAAATG AAGTTGCTTG ATTGAGGAGT TT AfiflYlC A A X x f\oi* X x 1 t-Vj 1 1 UAliA 3480

GACAAGTTGA AAGACCGTCG TCAAGACTAC ACTATAGGCA AATCCAGCCA A AP A PTTPr 3540

TGCTACTACC ACCCACAAGG ATGAAGACAA GGCAATCACG ATTTGCPPPA *» 1 X -L vj v, \„ v r\ A ^ rr* a a a fir*r 3600

AATACCAGAC CAGAGGAGCA GTTTCTCTTT AAAGATAGAA ATCAAGAAAG AAAAAPfPAr* 3660

CCCAGCCACA ATCCCGATCA ACTGCATGAT ACTAAGAACA AAACTAGATA 3720

CCCCAATCCT CTTTCCACCA TCAAACTTGG AATACGGATG GTAATAGCTG TATTGGTAPA X tX X X UV X AU\ 3780




AACTACAACT GCCGCTTCGA TAGCTAAGGT AAAAATCAAG CCTTTCATTT CTCGARTTAA 3840

ACGACTTGCT TCCTTCGCTC TTTTCTTGAC TTCTTTCTTT GATTTTCCAT AAGGGACAAA 3900

GAGCAGATAA AGGGGCAGCA CCAAAAATCC AGCACTATAG GCTAGAAAGA TAGCTGTCCA 3960


ACCAAAGGCC AAPAArrviAP \-vj>u_v^rtjC (.An Uu 1 AA1 KjxAAjPi GAAGCTCCAA CGACCTCTGC 4020


AGAAGCGCGT AGCCCTAACA TCTGAATTCG CCTTTTTCCT TGGTAGCGTT CACTGATAAT 4080

AGAAATGGCC TTGGCATTGA TCATCCCAAG ACCCAAACCA AAGAGAAGCC GTGTTCCAAA 4140

GACAAAGGGA TAGGCTTGGT ACCAGAAGGG AGCTGTACCG CTCAATGATA AAATCAGCAA 4200

GCCCAAACTA ATCTGTAAGC GCTCAGGAAA TATTTTTTCT AAGAAACCAT TTAGCAGTAA 4260

CATCATCATG ATTCCAAAGG AAGGCAAGCT CACCAAGAGC TCAATTTGTT CCTTAGAATA 4320

ACCCTGATAA TAGTCAAACA TGGCTGGTAG GGCACTCGAA ATGGAAAAGG AGGTAATCAA 4380

AACGAGGGAG AGAGCCAAAA TGCTGGCCCG TTCTAAAAAT TGTTTCATGA AATCTCTTTC 4440

TATATTTCTC TTAATCTTCT ACTTTTITGA TAGTTATCAA ATAAGCAAGA AAAGAAGAAG 4500
CCTCATTGGT TTGTAGACTC CTTCTTAAAT TCGAAAATGA ATCCCTTGTA TCTTATACTC 4560

AATGAAAATC AAAGAGCAAA CTAGGAAGCT AGCCGCAGGT TGTTCAAAAC AGTGTTTTGA 4620

GGTTGCAGAT GGAAACTGAC GTGGTTTGAA GAGATTTTCG AAGAGTATTA GGATGACTTT 4680

CTCTTGATTT GCTTGATAAA GTAGAAAATA AATCCTGCTA CCATATAGGC AACAAAGATA 4740

ATCAGACACC ACTTAAACAC AACATTCCAA CCCTTGTTCA CATTCAAAAA GAAGTAAGGG 4800

AAAGGATTAT CCTTGGCATT TGGAATATTG AGTTTTAGAA CCAAGCCATT AAAAAGAGCA 4860

AACATCATAT ACAGAAAGGG TAAAATGGTC CACACTGCTG GATCCCAAAT CTTGTATTGA 4920

CCCTGTTTGT CAAAAAAGAG GGTATCCGCT AAAAACCAGA TGGGAACGAT ATAGTGGCAA 4980

AGGAAATTTT CTAGGGTATA GAAATTAGTC GCAATGGGCG CCAAGAGGAA ATGGTAAATC 5040

ACACAGGTAA TCATGATACT CATGGTGACC CCACCTTTTA AGCGCAAGAG ACTTGGCCTT 5100

TGCCAATTTT CACCTACACG GCTCATAACC TTTAGAAGAT AAAGGGTAAA AATAGTTACC 5160

AAGAGGTTGG ACAGAACCGT GTAATAGAGA AGCATCCCAA AACCACCATG CTTAGTAATT 5220

TCAAGATAAA CTCCCGTAAA AGCCGCTAGA AACAAGAAGA TACGGCTATA AAATACAAGT 5280

TTATAGTGTT TTGACATGCT TAAATCTTCC TCACAAACTC TGATTTAAGT TTCATGGCAC 5340

CAAAACCATC AATCTTACAG TCGATATTGT GGTCGCCTTC TACGATGCGG ATATTTTTCA 5400

CGCGCGTCCC TTGTTTCAAA TCTTTTGGCG CACCTTTTAC TTTGAAGTCC TTGATGAGAG 5460

TTACTGTATC ACCATCAGCC AATTTATTTC CGTTGGCATC GATAGCGACA AGACCTTCTT 5520

CTACTTCTGC AACTTCAGGA GGATTCCACT CATGAGCACA CTCTGGGCAA ACCAGTAGGG 5580

CACCGTCTTC GTAGACATAC TCTGAGTTAC ATTTTGGACA ATTTGGTAAA TTGTTCATGG 5640

TTTCTCCTTA TCATCATTCA CTATTCTTTG AAAATCAAAA TTTCTCGAAC AGCAACTATT 5700

ATACCCTAAA ATCAGCATTT TGACAAATTT AGAAAAAAAC CGATATCAAT CTATCGGCTT 5760

TTCTACATTT ACATTCTTTT TTCAGCTTCT GCTTTGATTT TTTCAACTAC TTCTTGAATG 5820

TTCAAACCAG TTGTATCAAG GTAGACAGCA TCCTCTGCTT GTTTGAGAGG AGAAGTCTCA 5880

CGATGACTAT CCTTGTAGTC ACGCGCAGCA ATTTCCTTTT TTAGGGTTTC AAGGTCTGTT 5940

TCAATTCCCT TGGCAATATT TTCCTTGTAA CGACGCTCTG CTCTCTCATC AACAGAAGCT 6000

ACTAGGAAAA TTTTCAATTC TGCTTGTGGC AATACAACAG TTCCAATATC GCGACCATCC 6060

ATGACAATCC CGCCTTGCTG GGCAATTTCT TGTTGGAGAG AAACCAGTTT CTCACGCACT 6120

TGAGGAATTG CTGCAATAGC AGAAAGATGA TTGGTCACTT CATTTTGACG GATAGGATGG 6180

GTAATATCCA CATCTCCTAC AAAAACAAGC TGGTCTCCAG TTTCTGAACG TCCAAAGCTG 6240
ATTGGATGCT GGTCCAACAA GGCTAGAAGG GCTTCGACTT CTTCAACTCC TAATTGGTTC 6300

TTAAGAGCCA TATAGGTCGC TGCACGATAC ATAGCTCCTG TATCAAGGTA GGTGAATCCA 6360

AAATCCTTAG CAATAATCTT TGCGACCGTA CTCTTACCGC TGGAAGCAGG ACCATCAATA 6420

GCAATTTGAA TTGTTTTCAT ATCGGCTCCT ATTTTATTTT TATAACATCA CCTGGATTAG 6480

CAAACCAAGA TCCTGTAGCC ATGTGCCCAG GATTCAAGGC CTCTAACTGA GCAATGGAGA 6540

TTCCTGCACG AGCGGCAATA GCTGCTTCCC CTTCTCCTGC GAGAACTTTA ATCGTTCCTT 6600

CAGGATTAGC AGCTTCTTCT GAACTACTAG AAGTAGATTC TGGCTCTGAA CTCTGCTCAG 6660

GCTGAGAACT ACTTGAAGAT GAGATTTGTA CTACACTGGC ATCAGAATCA TGAAAGCCTT 6720


TTAAGGCTGC TGTGCGATTA CTCCCCCCCG ATGATAGATA GATGAGAACG ATGAPPATPA 6780

CCACCACAAT TACAAAGAAA ATACTAGCTA GGATCGTCAA AATACGATTA 6840

CAGCCCCTCC GTGGTTTCGA TGCCGACGCT CTGCTCTTGA TTCTTCTTGA TPATAfiATAT 6900

CTTCTTGCCA CGGTTCTTTT GCCATACCTT ACTCCTTGTT TTTTTTTACT x a x w x x f\ x x r\ 6960

CAATATAAAT ATGAACATGA AAATCACACT TATACCTGAA CGATGTATCG CCTGTflGGf* f P 7020

TTGCCAAACT TATTCTGATT TATTTGATTA CCACGATAAT GGAATCGTGC VJl 1 1 1 7080

TGACCCTGAC CAACTGGAAA AAGAAATTTC TCCTAGTCAG GATATCTTAG AGGCTGTTAA 7140

AAATTGCCCA ACTCGCGCCC TGATTGGAAA CCAGGAAGCC TAAATCAATG GCGATAATrY* 7200

ACTCCCTCTA GTTTAGCACA TTTCCATGTA AAATTATAGT CTTTTCACTT T ATTTTTT TP XnXX X 4 X X X Vw 7260


TGTAAAATCA GGAAGGTCAC TTTTTTCTTT GATAAGATAA AGTGGTCTTT TTTTAGTCTG 7320

TAAATAAATC TTACTGATAT ACTTGCCGAG AATCCCAATG GTCAAGAGTT GAATGCCTCC 7380

AAGAAAGAGA ATAACAGCCA TCAGAGAGGT CCAACCAGAT GTCGGATTGC CCAAAATGAG 7440

GGTCCGAACC ACAACAAAAA AGGTCATCAG CAGAGAAAGA AAACAAGATA GGAGACCAGC 7500


x n^./vinuuL x GAAAATCTGA AAAATTAATA ATCCCTTCAA TGGAGTAGAA 7560


AAAGAGTTGC CTAAAACTCC AACTTGTCTT GCCAGCCTGC CTTTCGACAT TTGGATAGTC 7620

CAAATAGTAG GTTTTGAAAC CCACCCAGGC GAAGAGCCCC TTTGAAAAAC GATTGGACTC 7680

GGTCAAGCTT AAAATGGCAT CGACTACAGA CCTTCTCATC ATACGAAAAT CACGGACACC 7740

CGACGGCAGA GCTACTGGGC TGATTTTTTG CATGAGGCGA TAAAAGAGAA CAGCACAGAA 7800

ACTGCGAAAG AAGGGTTCTC CCTCCCGACT AGTTCTCCGT GTCCCAACGC AGTCCAAGTC 7860

TACATTTTTG TCTAATACAT TTTTCATCTC AAACAACATA CTAGGAGGAT CTTGGAGGTC 7920

TGCATCCATC ACCACCACCA AATCTCCTGT CGCATATTGC AAGCCTGCAT AAAGGGCTGG 7980

TTCTTTGCCA AAATTTCGAG AGAAAGAAAT ATAATGGACT GCCGGATTTT GCTCCCGATA 8040
GGCCTTTAAG AGTTCCAAGG TCCCATCACT TGATCCATCA TCGACAAAGA CATACTCGAT 8100

TTCTGTTTCC AAATCTGGAA GTAAAGCTTC CAGAGCCTGA TAAAAAAGAG GAAGTACTTC 8160 

CTCTTCGTTT AAACAAGGGA CGATGATTGA AATCATCATC TTAGTCTTCA AATCCATTTG 8220 

GATGCTTGCT TTGCCAACGC CATGCGTCTT CACACATTTG GGTGATGTCG AGTTCTGCTT 8280 

CCCAACCGAG TTCTGCTTTA GCTTTTGCCG GGTCTGAGTA GCAGGCAGCG ATATCACCTG 8340 

GGCGACGTTC TACGATGCGG TAAGGAATAG GACGGCCCAC CGCTTTTTCC ATGTTTTGGA 8400 

TAATTTCAAG AACTGAGTAA CCTTTACCAG TTCCAAGGTT ATAAACGTTT AGTCCTGAAC 8460 

CTTTTTGGAT TTTTTTCAAA GCTGCAACGT GACCCTTAGC CAAATCGACA ACGTGGATAT 8520 

AGTCACGAAC ACCTGTTCCA TCTTCCGTAT CGTAATCGTC TCCAAACACT TGCACTTGCT 8580 

CTAATTTTCC AACGGCTACT TGAGTCACAT ATGGCAAGAG ATTGTTTGGA ATACCGTTTG 8640 

GATTTTCTCC CAAATCACCA CTCTCATGGG CTCCGATTGG GTTAAAGTAA CGAAGCAAGA 8700 

CAACATTCCA TTCTGAGTCT GCTTTGTAAA TATCAGTCAA AATTTGCTCT AGCATGAGCT 8760 

TAGTACGACC GTATGGGTTG GTCACTGAAA GTGGGAAATC TTCCAAGATG GGCACTGTGT 8820 

GCGGATCCCC GTAAACTGTC GCAGAAGAAC TGAAGATGAT GTTTTTACAG TTGTTTTCTT 8880 

CCATGGCTTT CAAAAGGCTG ACAGTTCCAG CGATATTGTT GTCATAGTAG GCAAGAGGGA 8940 

TACGTGTTGA TTCGCCAACA GCCTTCAAAC CAGCAAAGTG AATGACACCA GTCGGTTCTT 9000 

CCTGCTTGAA AATATCTCTG AGGGTATCTG TGTCACGAAT ATCTGCCTCA TAGAAAGGAA 9060 

TCTCAACTCC TGTGATTCCT TCAACAACTT CTAAACTCTT ACGATTGCTA TTGACAAGAT 9120 

TATCCACCAC AACAACTTGA TGACCTGCTT GGATCAATTC AATAACAGTG TGGGTTCCAA 9180 

TAAAACCGGC ACCACCAGTT ACCAAAATCT TTTCTTGCAT CTTTTTTCCT CGATTCTCAG 9240 

ATTATTTTTT CTTATTTTAC CATTTTTGAC AGGGAATGTC ATTTGCCATC CTAAACTACC 9300 

TGATAAAATT TCAGTAAAAT GCTTATACTC TTCGAAAATC CAATTCAAAC TACGTCAACG 9360 

TCGCCTTGCC ATGGGTATGG TTACTGACTT CGTCAGTTCT ATCCAGAACC TCAAAACAGT 9420 

GTTTTGAGCT GACTTCGTCA GTTCTATCCA CAACCTCAAA GCAGTGCTTT GAGTAACCCG 9480 

CGGCTAGTTT CCTAGTTTGT TCTTTGATTT TTATTGAGTA TTATTCGCTT TTTACTCGTT 9540 

TGACATAGTT TTCAATTGGG TAATTTAGAG GGTCCAAGGT CAACTCCTTG TCTTGGATCA 9600 

GTTGGGCTAG ATGGTAACCA ATGATAGGAC CAGTTGTGAG GCCTGATGAA CCTAGTCCAC 9660 

TGGCTGCATA GACACCAGTT AAGTCAGGGA CCTGCCCAAA GAAAGGAGAG AAATCACTGG 9720 

TGTAGGCACG GATTCCAACA CGCTCAGATT TTGAAGTAGC TTCAGCCAAA ATCAGATAGT 9780 
GAGTCAAGGT GGCCTCCTCC ATTTGTTGGA GCAAGGTTTC ATCTACCGTC AAATCAAATC 9840

CCATGTCATT TTCGTGGGTA GCGCCTAAGG ATAATTTCCC ACCTGCAAAG GGAATCAAAT 9900

CCCACTCCCC TTCTGGCATG ACAACAGGGT AATCTTCCAT GTCTTGGGCA AGCTGATAAT 9960

CTCGTAGTTG TCCTTTTTGA GGACGGACAT CCACTTCATA ACCTAAAGGC TCTAACATGT 10020

CCCCCAACCA AGCTCCCGTC GCCAAAATAA CCTGCTCAAA CTCCTCTTCA CCAATCTGGT 10080




TAACGGTGTC AGAGTCACTT TTTCTTTGAC CAGCTTGACA TGACTGACTT 10140

L L.ACK.AAACG AGTCACTAAA AGTTGGCCAT CTACTCTCGC TCCACCAGAA GCATAGAGCA 10200


GGCGGTCAAA TCCCTGCAAA CCAGGGAATA ATTCATTAGC TGAGGCTTGG TTCAGAATGG 10260

CTAATTGCCC TATCAAGGGA GATTCTTCTC TGCGCTGGAG GGCCAGTTGA TAAAGTTCTT 10320

CCAAATTGGA TTCATCCTTT TTCAAGAGAA AGACTCCCGA ACGCTGGTAA AAGTCGATTT 10380

CTTGTCCTGA TTTCTCTAAA TCAGCTAATA AATCCACATA AAAATCAGCC CCCAAGCGCG 10440

CCATCTTGTA CCAGGCTTTA TTACGGCGTT TGGAAAACCA AGGACTGATA ATTCCTGCTG 10500

CGGCCTTGGT GGCTTGACCT TGCTCATGGT CAAAAACGGT CACCTCTAGG TCACTTTCTC 10560

TCGAGAGGTA GTAGGCAGCT GTTGCTCCCA CAATTCCTGC TCCAATAATG GCAACTTTTT 10620

TCATTGTCTT CACTTTCTAA CTAGATATGA TGGAAAGGAT TGGTTGATGC CTGACTAGGC 10680

AAGATATCAA TAGACCACCC CTTATCTTCC TTCCATTGAC TAAGAAGTGC TGCGATTTTT 10740

TCTACAAAAA TCACTTCGAT ATAGTGACCT GGGTCCAATG CAAGCAACCC ATCAGATAGC 10800

ATATCCTGAG CAGTATGGTA GTAGATATCA CCAGTGATAT AGACATCTGC CCCCTTTGCC 10860

AAAGCATCCT TATAGAAAGA CTGCCCGCTT CCAGCACAAA TTGCTACTCT TGAAATAGGC 10920

TTCTGCAAAT CATCCTCTTG ATAATGCAGC ATTCGAAGGC TATCTAGGTC AAAGACTTGC 10980

TTGACCTGTT GGGCCAATTC CCAAAATGTC TGAGGCTGAA TATTCCCAAT ACGTCCAATT 11040

CCACGTTCTG GACCTGTTTC CTGCAGATAA GTCGTCTCCT CGATTCCTAG CATCTGACAA 11100

AACCAGTCAT TGAGCCCATT TTCAACGATA TCAATATTGG TATGGCTGAC ATAAACTGCG 11160

ATATCATGCT TAATCAGGTC GATGTAAATC TGATTTTGCG GACGGCTGGC AAGCAAGTCC 11220

TTGATAGGAC GAAAGATAGG CGCGTGCTTG ACGATAATCA AGTCCACACC CTTTTCAATG 11280

GCCTCTGCCA CTGTCTCTTC ACGAATATCG AGGGCAACCA TGACCCTTTG GATACCCTTG 11340

TCTAAAGTGC CAATTTGCAG ACCACGGCTG TCTCCCTCCA TAGAAAATTC CTGAGGGCAA 11400

AAGGCTTCAT AAGCTTGGAT CACTTCACTT GCTAACATGG AGCACCTCCT TGATAGCTTG 11460

AATCTTATCT ACTAGAACTT GACGTTCTTC CAGATTTTTT TCTGGGATTT GTCCGAGGGC 11520

GAACTCTAGC TTCTCAGCTT CTTTTTGCCA TTTTTGGACA AATACTGGAC TGACTTCTTT 11580




GGACAAGAAG GGACCAAAGC GAACATCACT GGCTGATAGC TTCATTTGTC CTGCTTCCAC 11640 

CACCAAAATC TCATAAAACT TTCCAGCTTC TTCTAAGATG CTTTCTGCTA CAATCTGGAA 11700 

TCCATGATCC TGTAGCCAGA TACGCAAGTC GTCTTCACGA TTATTGGGCT GGAGGATCAA 11760

ACGCTCTACA TTAGCTAACT TCCCCAAACC TTCTTCTAAA ATCCTAGCAA TCAAACGACC 11820 

ACCCATGCCA GCAATGGTAA TGACAGACAC TTGGTCAGTC TCTTCAAAAG CTGCCAAGCC 11880 

ATTGGCTAAA CGGACTTGGA TTTTCTCCTT TAGGCCGTGA GCCTCAACAT TTTTAACCGC 11940 

AGACTGATAG GGACCTTCCA GCACCTCACC TGCAATAGCG CTTTTGATTT GGCCTCTCTC 12000 

AACCAACTCG ATAGGCAGAT AAGCATGGTC ACTTCCCACA TCTAGTAAAA TAGCCCCCTG 12060 

TGACACAAAG GAAGCTACCA ATTCTAATCT CTTTGAAATC ATCTTCTCTC ACTTTCCAAA 12120 

ACTCTATTAC CTCTTATTAT ACCACATTTC AATCTTCAAC TTCCCAGTAA TATAAGCACC 12180 

TCTGGCGAAA GAAGTTTCAA TGTCCTAAAG TAATAAGTGA ATCCAATTGA AAGATTTTAA 12240 

ACAATTTGCA AAAATGTCAA AAAATAAAAA ATAAACAGTT TATTCAGAAA ATTCTTGACA 12300 

TATAAAAACA CATGGTAGAA TATAATTAGA AAGTTAGAAA AAATAAAAGT TTGACTAAAA 12360 

TTTGTATTTG AAGGTGGTGT TCAGATAAGA AATTTAGTCA GACGAACCAC GAATTTGCTC 12420 

TATGCTTTCT GGAATTTATC ATAACAGGAG GATACAGTCA TGGAACAAAC ATTGTTTGAA 12480

TTAGAACTAC TTCCAGAGGA AGATATCATT GTCACAGGTC TCCCTAAGTA TTGTTCTTTT 12540 

ACTTGTTTAA TTACAGGTCG CTAGTTATAT TTTATATAAA ATAAGTAGCT TTACTTACGG 12600 

AATAGGCTAG TGCTGTGTCT CTAGCCTATT TTAATAATTA GGAGTTTGTT ATGGATTTAT 12660 

TAGAGAAAGA ATGTTTAAAA TGTGATAAAA ATTTCCAACA GGGTGATATT TGGAATTACT 12720 

ATTATTTATC AGATAAGATG CCTGCACAAG GGTGGAAAAT ACACATAAGC TCCCAAATAA 12780 

AAGACGCTGT AAATATTTTT AAGATTGTGT ATAAACTATC CCAACTAAAT AATTGTAGCT 12840

TTAAAGTTGT TAAAAATTTA GAGGAATTAA AAAAAATTAA TTCCCCTAGG GAAATGAGGC 12900 

CTACTGCTAA CAAATTTATA ACTCTATATC CTAAGTCAGA ATCTGAAGCT AAGAGTATGA 12960 

TTTGTAATCT TACGAATAGA CTGTCAGAAT TTAAGGCTCC AAAAATACTA TCTGACTATC 13020 

AATGTGGAAT GCATTCTCCA GTTCATTATA GATATGGGGC TTTTTTAAAA AAACAAGCTT 13080 

ATGATGAAAA AAATAAAAAA GTCATCTATT TATTGCTAGA TGAAAAAAGG AAGAACTATG 13140 

TAGAAGATAA GAGACAAAAT TTCCCTAGTC TTCCTAGCTG GAAAATGGAT TTATTTTCAG 13200 

AAGAAG 13206 
(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13104 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:



CCGGATCCAG CGAAAAATAT GCTCTTTGAT GCTGTAAGTG GTCAAAAAGA TGCTAAAACA 60

GCTGCTAACG ATGCTGTAAC ATTGATCAAA GAAACAATCA AACAAAAATT TGGTGAATAA 120

AAAATTTGTT CAAGGGGGGT GGAAATCAAA TCCCCCTTTG AATTTATCAA TAGAGACACA 180

AATAATTTAG CTTTCTTATA AAAAAGTAGT ATCCTATGAA AGGAGTTAAT ATGGAAAAGC 240

AACAACCTAG TAAAGCAGCC CTGCTGTCTA TCATTCCTGG GTTAGGACAG ATTTACAATA 300

AACAAAAAGC CAAAGGTTTT ATCTTCCTTG GTGTAACCAT CGTATTTGTC CTTTACTTCC 360

TAGCACTTGC AACCCCTGAA TTGAGCAACC TCATCACTCT TGGTGACAAA CCAGGTCGTG 420

ATAATTCCCT CTTTATGCTG ATTCGTGGTG CCTTCCATCT AATCTTTGTA ATCGTTTATG 480

TACTCTTTTA TTTCTCAAAT ATCAAAGATG CACATACGAT TGCAAAACGC ATTAACAATG 540

GAATTCCAGT TCCACGCACA CTCAAAGACA TGATCAAAGG GATTTATGAA AATGGCTTCC 600

CTTACCTCTT GATCATTCCA TCTTATGTTG CCATGACCTT CGCGATTATC TTCCCAGTTA 660

TCGTAACCTT GATGATCGCC TTTACCAACT ACGACTTCCA ACACTTGCCA CCAAACAAGT 720

TGTTGGACTG GGTTGGTTTG ACCAACTTTA CAAACATTTG GAGCTTGAGT ACCTTCCGTT 780

CTGCCTTTGG TTCTGTTCTT TCTTGGACTA TCATTTGGGC TTTGGCAGCT TCTACTTTAC 840

AAATCGTAAT TGGTATCTTC ACAGCTATCA TTGCCAACCA ACCATTTATC AAAGGAAAAC 900

GTATCTTTGG TGTTATTTTC CTTCTTCCTT GGGCTGTCCC AGCCTTCATC ACTATCTTGA 960

CATTCTCAAA CATGTTTAAC GATAGTGTCG GTGCTATCAA CACTCAAGTA TTGCCAATCT 1020

TGGCTAAATT CCTTCCTTTC CTTGATGGAG CTCTTATTCC TTGGAAAACA GACCCAAOTT 1080

GGACTAAGAT TGCCTTGATT ATGATGCAAG GTTGGCTCGG ATTCCCATAC ATCTACGTTC 1140

TGACCTTGGG TATCTTGCAA TCTATTCCTA ACGACCTTTA CGAAGCAGCT TATATTGACG 1200

GTGCCAACGC TTGGCAAAAA TTCCGCAACA TCACTTTCCC AATGATTTTG GCTGTTGCGG 1260

CACCTACTTT GATTAGCCAA TACACCTTCA ACTTTAACAA CTTCTCTATC ATGTACCTCT 1320

TCAATGGTGG AGGACCTGGT AGTGTCGGAG GTGGAGCTGG TTCAACCGAT ATCTTGATCT 1380

CATGGATCTA CCGTTTGACA ACAGGTACAT CTCCTCAATA CTCAATGGCG GCAGCTGTTA 1440

CCTTGATTAT CTCTATCATT GTCATCTCAA TCTCTATGAT CGCATTCAAG AAACTACACG 1500

CATTTGATAT GGAGGACGTC TAAGATGAAT AACTCAATTA AACTCAAACG TAGACTGACT 1560

CAAAGCCTTA CTTACCTTTA CCTGATTGGT CTATCAATTG TAATTATCTA TCCACTGTTG 1620 

ATTACCATTA TGTCAGCCTT TAAAGCAGGT AACGTCTCAG CCTTTAAACT AGATACTAAT 1680 

ATCGACCTCA ATTTTGATAA CTTTAAAGGG CTCTTCACTG AAACCTTGTA CGGTACTTGG 1740 

TACCTCAACA CTTTGATTAT CGCCTTAATT ACCATGGCTG TTCAAACAAG TATCATCGTA 1800 

CTTGCTGGTT ATGCTTACAG CCGTTACAAG TTCTTGGCTC GTAAACAAAG TTTGGTCTTC 1860 

TTCTTGATCA TCCAAATGGT GCCAACTATG GCCGCTTTGA CAGCCTTCTT CGTTATGGCG 1920 

CTTATGTTGA ACGCCCTTAA CCACAACTGG TTCCTCATCT TCCTCTACGT TGGTGGTGGT 1980 

ATCCCGATGA ATGCTTGGCT CATGAAAGGC TACTTCGATA CAGTGGCAAT GTCTTTAGAO 2040 

GAATCTGCAA AACTAGACGG TGCAGGACAC TTCCGCCGCT TCTGGCAAAT TGTTCTACCA 2100 

CTTGTTCGCC CAATGGTTGC CGTACAAGCT CTCTGGGCCT TCATGGGACC TTTCGGGGAC 2160 

TACATCCTCT CTAGTTTCTT GCTTCGTGAG AAAGAATACT TTACTGTTGC CGTAGGTCTC 2220 

CAAACCTTCG TTAACAATGC GAAAAACTTG AAGATTGCCT ACTTCTCAGC AGGTGCTATC 2280 

CTCATCGCCC TTCCAATCTG TATTCTCTTC TTCTTCCTAC AAAAGAACTT TGTTTCAGGA 2340 

CTTACAAGTG GTGGCGACAA GGGATAATTT ATCCCCGCCA CCCTTTTTCA TTTTATACTC 2400 

TTCGAAAATC TCTTCAAACC ACGTCAGCTT TATCTCCAAC CTCAAAGTTG TGCTTTGAGC 2460 

AACCTGTGGC TAGTTTGCAC TTTGATTTTC ATTGATTATT AGCAATTGTC ACTGTAAATA 2520 

ATATCCTTGT AGCAAGCAAT TTTTCTCCTA GACTTGAAAT AAAGCGCATT TCTCTATATA 2580 

ATAATACTCA TATAGAAAAC ACCTTTTAGA AAGATACCTA TGCTTCCATA TCCATTTTCC 2640 

TATTTTTCAA GTATTTGGGG GGTTCGTAAG CCCCTGTCCA AACGTTTCGA GCTCAACTGG 2700 

TTTCAACTTC TCTTTACCAG TATCTTCCTT ATCAGCTTGT CTATGGTACC CATTGCTATC 2760 

CAAAACAGCT CCCAGGAGAC CTATCCGCTA GAAACTTTTA TCGATAATGT CTATGAACCT 2820

CTGACAGATA AGGTTGTCCA GGATCTCTCT GAACATGCTA CAATTGTCGA TGGCACATTA 2880 

ACTTATACTG GAACAGCTAG TCAAGCCCCT TCTGTTGTGA TTGGTCCAAG TCAAATCAAG 2940 

GAATTACCTA AGGACTTGCA ACTGCATTTC GATACAAATG AGCTAGTCAT CAGCAAGGAA 3000 

AGCAAGGAAC TGACCCGCAT CTCTTACCGA GCCATTCAGA CTGAGAGTTT CAAAAGCAAA 3060 

GACAGCTTGA CCCAAGCAAT TTCTAAAGAC TGGTACCAAC AAAATCGTGT CTATATCAGC 3120 

CTCTTCCTAG TTCTCGGTGC GAGCTTCCTC TTTGGTTTGA ATTTCTTTAT CGTCTCTCTT 3180 

GGAGCTAGCT TTCTCCTTTA TATCACCAAA AGATCACGGC TCTTTTCATT TAATACCTTT 3240

AAAGAGTGCT ACCATTTTAT CTTGAACTGT TTAGGATTGC CGACTCTGAT TACACTTATT 3300

TTGGGATTAT TTGGCCAAAA TATGACAACC CTGATTACTG TACAAAATAT TCTTTTTGTT 3360

CTGTATCTGG TCACTATCTT TTATAAAACA CATTTCCGTG ATCCAAATTA CCATAAATAG 3420 

GAGATTTTTA TGCCCGTTAC GATTAAAGAC GTGGCCAAGG CTGCTGGTGT TTCGCCTTCA 3480 

ACCGTAACCC GTGTTATTCA AAATAAATCA ACCATTAGCG ACGAAACAAA AAAACGTGTT 3540 

CGCAAAGCTA TGAAGGAACT CAACTACCAC CCAAACCTCA ACGCTCGTAG CTTGGTAAGC 3600 

AGCTATACTC AGGTTATCGG ATTAGTTCTT CCTGATGACT CAGACGCCTT CTACCAGAAT 3660 

CCTTTCTTTC CATCGGTTCT ACGTGGCATC TCTCAAGTCG CATCTGAAAA CCACTATGCC 3720 

ATTCAGATAG CAACAGGGAA AGATGAGAAG GAGCGTCTCA ACGCTATTTC ACAAATGGTC 3780 

TACGGCAAGC GTGTAGATGG GCTAATTTTT CTCTATGCCC AAGAAGAAGA CCCTCTCGTA 3840 

AAACTCGTCG CAGAAGAACA GTTCCCCTTC CTTATCTTAG GTAAATCTCT ATCTCCTTTC 3900

ATCCCACTTG TCGACAACGA CAATGTTCAA GCTGGTTTTG ATGCGACTGA ATATTTCATC 3960 

AAAAAAGGCT GCAAACGCAT TGCCTTTATC GGAGGAAGTA AAAAGCTCTT CGTGACCAAA 4020 

GACCGTTTAA CAGGCTATGA ACAGGCGCTT AAACATTACA AACTTACCAC TGACAACAAT 4080 

CGCATCTACT TTGCCGACGA GTTTCTGGAA GAAAAGGGCT ATAAATTTAG CAAGCGATTA 4140 

TTCAAGCACG ATCCACAAAT TGATGCTATC ATCACAACCG ATAGCCTCCT AGCTGAAGGT 4200 

GTTTGTAACT ATATTGCCAA ACACCAGCTG GATGTCCCTG TTCTCAGCTT TGACTCGGTT 4260 

AATCCCAAGC TCAACTTGGC AGCCTATGTC GATATCAATA GTTTAGAGCT TGGTCGTGTT 4320 

TCCCTTGAAA CTATTCTCCA GATTATTAAT GATAATAAAA ACAATAAACA AATTTGTTAC 4380 

CGTCAATTGA TCGCCCACAA AATTATCGAA AAATAAGAGA CTGGGCAAAA AGTCGTTAAA 4440 

AGCAAAAACG CATACTATCA GGTATTGAAA AAACTTGATA CTATGCGTTT TATTGTGGGA 4500 

AGATTTACTT CCTTTTCTAC TGAAATTGAG TCTTTTCCCA AGATCTTTTT ATACTCAATG 4560 

AAAATCAAAG TGCAAACTAG GAAGCTAGCC GCAGGTTGCT CAAAACACTG TTTTGAGGTT 4620

GTAGATGAAA CTGACGAAGT CAGTAACCAT ACCTACGGCA AGGTGAAGCT GACGTGGTTT 4680

GAAGAGATTT TCGAAGAGTA TTAATCACTA ATTATCTATC TCAACAAATC TTCCTAGAAT 4740 

ATGAACATTT TCCGAGACAG AGACAAAGGA GCTTGGATCC ACTTGTGTCA TAATCTGTTT 4800 

AAATTCATTA AACTCTGCAC GTGTAATGAC AGTGATTAAA ACTGCCTTTC TCTCGTGATT 4860 

ATAGGTTCCT TCTGCATCGT GGATCATGGT TGCTCCGCGG TGCAATTTTT TATGGATTTT 4920 

TTCAATTACC TTCTCTGGAT GATTTGTCAC AATCATGGCC TGCATACGCT TTTGCTTAGT 4980 

AAAGACTGCG TCTGTCACAC GGCTAGAGAC AAAGATGGTA ATCATAGAAT AAAGAGCGTA 5040

TTTCCAACCA AAGGTCAAAC CTGCTATCAG CATGATAGTT CCATTTACCA AGAAAGAAAT 5100

ACTACCGACA TTCTTACCCG TTTTCTTACG AATAGTCAGG CTGACGATAT CCGTCCCACC 5160

ACTGGAGATA TTGTTTCGAA GAGCAAAACC AATCCCCAAA CCCATAACAA CACCCCCAAA 5220

AAGGGAATTG ATAATGGGAT CCTCTGTCAA GGTTGCCACA GGGACAAACT GGATAAAGAA 5280

GGAACTCATA GATACCGTGA TAAAGGTAAA GACGGTGAAC TTATGGCCAA TCTGATACCA 5340

AGCTAAGACC ATCAAAGGGA AGTTAATGGC GTAGAAGCTT AGCGAAATCG GAATATGAAA 5400

ACCAAACCAG TGATTACTCA AGGCAGAGAT AATCTGTGCC AGACCTGTTG CACCACTCGA 5460

ATACACATGC CCTGGTTGGA AAAAGAAATT AACTGCTACT GCTGATAAAA AACCATAGAG 5520

CAGAGAGGCC GAAATCTTCT CATCATACTT TTCTCGAGAG ATACTTTGTA AGACACGTAA 5580

AATTTTTATC TGATAAGCAA AGCGGCGCAG ATAATAGOGC CACCGCTTAA TTCGTTTTGT 5640

TTGTTTCATC TTCTTCTACT TGTAAGCTGA GTTCCTCTAG TTGTTTGAGA GCGACTGTTG 5700

ATGGAGCTTG TGTCATTGGG TCAGTTGCCT TGTTGTTCTT AGGAAAGGCA ATGACTTCAC 5760

GGATATTTTC TTCTCCAGCA AGCAACATGA CAAAACGGTC AAGCCCGATA GCCAAACCAC 5820

CGTGTGGTGG GAAACCATAG TCCATGGCTT CAAGAAGGAA ACCAAACTGG TCATTGGCTT 5880

CTTCAGTTGA GAAACCAAGA GCCTTGAACA TGCGTTCTTG AAGGTCTTTT TGGTTGATAC 5940

GAAGGCTACC ACCACCAAGC TCATAACCGT TCAAGACGAT ATCGTAAGCA ATGGCACGAA 6000

CCTTAGCCAA ATCACCTTCT AATTCATGAG CAGTCTCTTG CTGTGGAAGT GTGAAAGGAT 6060

GGTGGGCGCT GATGTAGCGG CCTTCTTCTT CAGACCATTC AAACATGGGC CAGTCAACCA 6120

CCCAAAGGAA GTTGAACTTA TCATTATCAA TCAAGCCAAG CTCTTTAGCA ATACGTCCAC 6180

GAAGGGCACC CAGTGTTGCA TTAGCCACTT CAAGCGTATC CGCCACAAAG AGAACCAAGT 6240

CCTTATCTTC AAGAACAAGC GCTGTTGTCA ATTCTTCTTG GATACCAGTC AAGAACTTGG 6300

CAACTGGTGC GTTTAATTCT CCATCAACCA CCTTGACCCA AGCAAGACCT TTGGCAGCAT 6360

ACTGTTTGGC TACTTCCGTC ATCTTGTCGA TGTCTTTACG TGAATAGTTG TCCGCAGCTC 6420

CTGTGACCAC AATCGCTTTT ACAGCAGGTG CTTCTGAAAA GACTTTAAAG TCTACACGTC 6480

GGACCACTTC TGTCAAGTCC TGAAGCAACA TGTCAAAACG AGTATCTGGC TTGTCAGAAC 6540

CGTAAAGAGC CATAGCATCA TCGTATTTCA TACGAGGGAA TGGTAGCGTT ACTTCGATGC 6600

CTTTTGTTTC CTTCATCACG CGCGCGATCA AGCTTTCTGT AATATCTTGG ATTTCTTGCT 6660

CAGTAAGGAA GGACGTTTCC AAGTCGACCT GAGTAAATTC AGGCTGGCGG TCTCCACGCA 6720

AGTCCTCGTC ACGGAAACAT TTAACGATTT GGTAGTAACG GTCAAAAGCA GCATTCATGA 6780

AGAGCTGTTT CGTGATTTGT GGACTTTGAG GAAGAGCGTA AAAATGCCCC TTATTAACAC 6840

GAGACGGCAC TAAATAATCA CGCGCCCCTT CAGGCGTTGA CTTAGAAAGG AATGGTGTCT 6900 

CCACGTCGAT AAACTCCAAC TCATCCAAGT AGTTGCGGAT AGAGTGGGTC ACCTTGGCAC 6960 

GAAGTTTAAG ATTTTCCAAC ATTTCTGGAC GACGAAGGTC AAGGTAACGG TAACGCAAAC 7020 

GTGTATCGTC ATTTGCCTCA ATGCCATCCT TAATCTCAAA TGGTGTTGTC TTAGCTGTGT 7080 

TAAGCACAAT AAGAGCTGTC ACGTTTAACT CAACCGCACC AGTTGGCAAC TTATCATTGG 7140 

CTTGTCACGC GCAGCGACCT GACCAGTCAC CTCAATAACA AATTCGCTAC GAAGGCTTTC 7200

AGCTGTTGCC ATAACCTCTG CAGATACTTT TTCAGGGTTG ATAACCAACT GCATGATTCC 7260 

TTCACGGTCA CGAAGATCGA TAAAGATCAA ACCACCAAGG TCACGACGAC GGCCAACCCA 7320 

TCCTTTCAAG GTTATTTCTT GTCCGATGTG TTCCTCACGA ACACGACCAG CATACATACT 7380 

ACGTTTCATT ATTTCTCTCC TCTTTTATTC TGTTACTATT TTACCATAAA AGCGCAGCTC 7440 

TTCATGAAAA TCATCAGAAA AGTTTGCCAG TCTTTAAAAG TCAGGTGAAA GCCCTAAAAA 7500

TTAGCGCTAA TACTCTTCGA AAATCTCTTC AAACCACGTC AGCGTCGCCT TACCGTATGT 7560 

ATGGTTACTG ACTTCGTCAG TTTCATCTAC AACCTCAAAA CCATGTTTTG AGCTGACTTC 7620 

GTCAGTTCTA TCCACAACCT CAAAACAGTG TTTTGAGCAA CCTGCGGCTA GCTTCCTAGT 7680 

TTGCTCTTTG ATTTTCATTG AGTATAATAC AAAAATCCGA TGAACTTCAC CGGACTCTTT 7740 

TATTTTGAAT TTTTGCCTGC TTTACGCTTT TCAGCGATTT CGGCTGCCTT TCGAGGCAAG 7800 

ACAATTTCCG TTATGTAAGC CGTCCCAAAA CGCAGTACAC CTGCAATAGG AGCAAAGACA 7860 

ACTGCTAGAT AGTTATAGAA GAAATCGCCT TTGAAGGCAT AAGCTAGCGC TCCAATGATG 7920 

AAAAATAGAA CGACTGCCTG AATCACTGCT AATAAAATTA CTCGTTTCAT GTGACCTCCT 7980 

GACTCTATTA TAGCATGAGA ATCATCAAAA AGCCGACTAA ATTATTCAAA GCGTGAAGAG 8040 

AAATACTGTA GACCAGACCT TTTCTGCTAA TGTAAGCCAA ACCCAAACTA AAACCAAGGC 8100 

TAAAATAGAC AAAAAATTGT TGCACATCAC CTGGAAAATG AATCAAGGCA AATAGAAGAC 8160 

TAGATACCAG AAGAAAAATC AGGGTTCGTT TACTATTGTC CTGCTTAGGA AAGAGATAGC 8220

GTGCTAACAT CCCTCTAAAA ACAATCTCTT CCGTCAAAGG AGCAAAAATA ACCACAGCAA 8280 

AGAATGAGAA AAGTGGTTGA GACAAGGTCA AGTCTGTCGC TATTTGCTGA TTTACTGAAG 8340 

GATCATCTGG CAAGAAGAAT TGAACGACCA GAGATAAGAA CCAAACCAAG ACAGGAAGCC 8400 

AAATAAATCG ATTAAAGCCG CTCTTCTCAA TATGAACAGG AGCCTTCTGA TACCATTTGT 8460 

AAATGCCGTA CACATATACT CCAGCCAAGG CCACATAGAG TAGAGTAACA GCATAGGGTG 8520 

AAGCGCCTAA AGCAAGCGAC GCAGTCGCGA GCCCCTGAAT AAAGCCATAG ATAAATAAAA 8580

AGGATAGAAG GGCTAGAAGA ATCCAGCCAA GGTTTTTAAG TAATTTCATA GATAACTCCT 8640

TTATTTGAAA TAACGTTTTA CCATAGGTAA CTGCATCACA TTGATATAAA CATGGATGGC 8700

TCCTACAAGC AAGAAAGCTA GTAACTGAAT CTCTCCTGTC AAGAAAGAAA TGATAATAAG 8760

AAAAATATAT AAGGCTGGTA AGACATATTG GTGTAATTGG AATAAAATTC GAAAACTCTG 8820

TTCCAAATTA GCCTGACGCT CCCCTTCATC ATAAGAATTT ATATAGTTCA AGACATCCTT 8880

TGGTGTAGCG AAAAATTCCA AATCAAACTG ACGAACAATC GCAATGGTTT TAAAAAGAGA 8940

TTTTTGAGCG ACTAAGAATA CCACAAAGAG TAAGAAAGAA AGGAAAAATG TTTGAGGGTT 9000

TGTATGCAAT ATAATCACCT CACTTAATGA AATAAAAATA GCCAATGGAA TCGCTACACC 9060

TGTAATATTA AAAGCAATGG TTCCAAACTC AAGATTCCGA TACATTTGCA CATAATAGGT 9120

TTCATTCAGA TCGTCATCCA TTTCCTCTTG ATACAAAGAA TGAAATTTTC TGCTTTTCTT 9180

TAAGAAATTG AAAGTCAAAA ACATACTAAT GAAACCTATC AGTAAACAAA TAGCTGATAT 9240

CCATGGCATC AAGGCTTTTA CATCTAAAAT AATTTCGTGG GATTCGACAC GTGCCTTAAA 9300

CATCCCTACA AACATGCCCA AGAACCCCCC AAGACAATAG ACATCAAAAA TAACAATCTA 9360

CGTTTCTTTT TCATATTCAT TCTCCTTTTT CACTTGCTAG ATTTTTGGAT TTCTTTTCAA 9420

TCCATTCAAT TACTGGGATG AGAGCAAAGT AGACCCAAAC AAATTGGTCG CTTTGATAGG 9480

GATTAAACCA GCTTAGGTCC ATCCCAATCA GTAGAAATAC GCTGACTAAT AAAGCTATGA 9540

CCACTACATA ATAAATCACT TTATACTTGT TCATCACTCG TCCTCCTCCA AACGAAATAC 9600

CGATTCGACT GTTTCGTTGA AAATTTGAGA TATTTTCAGG GCAATGATAA TGGATGGGGT 9660

GTACTCATCC CGTTCTAGTA GGCTAATGGT CTGTCTGGAA ACCCCTGCCA GTTTGGCTAG 9720

GTCGGTTTGA TTGAGACCAT CGCGAGCTCG AAGCTCTTTT AGACGATTTT TTAGTTGCAT 9780

GTTACACACC TACTCTCCGT CAAATTCAAC GGTTTGGATA TCCTCAATAC GTTGCAACTT 9840

GAATTTTTCT TTTCCCGTAT TATCTACACG TCGTAGCTTT ACCCATTCCT CATCAACATC 9900

CACAACTTCC CAGTTATCTG GCCCAATATA CACTCCCGTT ATAATTGGTT CCTTTCCAAT 9960

CATTTCTTGT AATAATCTCG ACATTTCTGC GTTTCCTTTC TCTTTTCGCT CAAGTCTTTT 10020

GATTTTATTC TCTAGTTTCT TGATTTTTTT AGAATTATTA GAATAAAAGA AAATCATAAA 10080

TAGTATAAAT CCTAGTACCC ACATTATAAC TCCTTTCTGC TTCCTATTTC TTAACTTGAA 10140

TTCATTGTAA CATATCTTTT TCTTTTTGAC AAGTATAGTT GTCAAAAAAA TTATGATTTT 10200

TGTCATTTTG CAAAAGAAAA AGGTCAGGAG TAGGTTCCTG ACCACTTTAT CTATCATTAA 10260

TACTCTTCTA AAATCTCTTC AAACCACGTC AGCTTCACCT TGCCGTAGGT ATGGTTACTG 10320

ACTTCGTCAG TTTCATCTAC AACCTCAAAA CCATGTTTTG AGCTGACTTC GTCAGTTCTA 10380


TCCACAACCT CAAAACCATG TTTTGAGCTG ACTTCGTCAG TTCTATCCAC AACCTCAAAA 10440

CCATGTTTTG AGCTGACTTC GTCAGTTCTA TCCACAACCT CAAAACAGTG TTTTGAGCAA 10500

CCTGCGGCTA GCTTCCTAGT TTGCTCTTTG ATTTTTATTG AGTATAAAAT CCTAGTTTTT 10560

CAAAGATTTC TGAGAAGTTT TGGCTGATTG TCTCAAGTGA CACTTGCACT TCTTCTCGGG 10620

TTTGGTTGTT CTTGACCGTC ACTTGTCCGC TTTCGACTTC GCTCTCTCCT AGGGTGATGA 10680

GGGTCTTAGC CGCAAAGACA TCGGCTGACT TGAACTGAGC TTTTAGTTTA CGGTTGAGGT 10740

AATCACGCTC TGCTTTGAAA CCTTGTTGGC GAAGAGCCTG TACCAATTCC AAGGCCTTGA 10800

TATTTGCCCC TTCGCCCAAG ACTGCGATAT AGACATCTAG GGCGTTTTCG ATAGGGAGGG 10860

TCACACCTTG CTTTTCAAGG ATGAGAAGCA GGCGCTCTAC ACCAAGTCCA AAACCAAATC 10920

CAGCAGTTTC AGGGCCTCCA AAGTAAGCAA CCAAACCATC GTAGCGACCA CCCGCACAGA 10980

CGGTCAGGTC ATTGCCCTCA ATCTCTGTGA TAAACTCGAA AATGGTGTGG TTGTAGTAGT 11040

CCAGACCACG CACCATATTG GTATCGATGA TGTAATCTAC TCCAAGATTT TCCAACATCT 11100

GACGCACAGC ATCAAAATGA GCTTGGCTTT CTTCATCAAG AAAGTCCAAG ATAGACGGCG 11160

CATTCTCTAC TGCCACCTTG TCTTCTTTTT CCTTAGAGTC CAAGACACGA AGAGGATTTT 11220

CCTCCAAGCG ACGTTGGCTA TCCTTAGACA AGGTCTCCTT GAGCGGTGTC AAATAGTCAA 11280

TCAAGGCTTG GCGGTAGGCT GCACGGCTCT CAGGATTTCC AAGAGTGTTG AGGTGCAATT 11340

TGACACCTTG AATACCGATT TCCTTCAAAA AATGGGCTGC CATAGCGATT GTTTCCACAT 11400

CGGTAGCTGG ATTGCTAGAG CCAAAACACT CAACACCAAT CTGGTGGAAT TGGCGCAAGC 11460

GCCCTGCCTG TGGACGCTCA TAACGGAACA TAGGTCCCAT GTAGTAGAAC TTGCTTGGCT 11520

TTTGCACTTC TGGGGCGAAA AGTTTATTTT CCACATAGGA ACGGACAACG GGTGCAGTTC 11580

CTTCTGGACG GAGGGTAATA TGACGGTCAC CCTTGTCATA AAAATCGTAC ATTTCCTTGG 11640

TTACGATATC CGTTGTATCT CCGACAGAGC GACTGATAAC CTCGTAATGC TCAAAAATAG 11700

GCGTGCGCAC TTCTGCATAG TTGTAGCGTT TGAAAATCTC ACGGGCAAAG CCCTCAACGT 11760

ACTGCCACTT AGCAGACTCA GCAGGTAAAA TATCCTGCGT TCCTTTTGGT TTTTGTAATT 11820

TCATAGGGAA TCCTCTTTAA ACTTAATAGT CTTATTTTAC CATAAATAGA GGGATTAAAA 11880

CAGTAAGAAA AAAATTAGGA TTTAGATATC ATTTTTGAGA TTAAGAATTG TCAAAAAAAT 11940

AGCTAGCAAG GAAAGACCAA CAAATAGCAT CCAAGTCAAC TGTATATTCC ATACGGCTAC 12000

TAGTGAAAAA CAAGCTGTTC CCACAGGTAT GGATAAGGTA AACAATAGAC CTAAAAAATT 12060

ACTAGTACGA GCTAGAACCT CTGGAGCTAG ATTTTTCATG AGCATGGCAC TAATCTTTGG 12120

TTGAACTTTA CCAGACACAT ACAGAGTAAA GAAGAGAAAT AGCAAACCAA GCACGACTTG 12180

ATTGAATAAA TTAGCCAAAC CAACTAGACT AAGTCCTACG GTCTCCCACA TCATCAATCT 12240 

AGGCAAGGAC TGCTTCCCAA AATAATCATT GCCCGTAAGG CTACTGATGA TGACTGATAC 12300 

TAAAACACAG AATTGATTGA TAAATAGTGC CTCTGTATAA GAAAAATTCA AGAGAGAATG 12360 

GCTCAAAAAG AAGATATTAT AAATTCCACC CAAAGCGCCA CCCAAGGAAT TAATAAGCAA 12420 

GACAGCAAAG AGCATAAAAC CAAAGTTTTT CTGTCCACTT TTAAGAAAAA CGAGACGTAA 12480 

ATTTCGGTAA ATTGTTAGGA ACTGGTCTTT GATAGAAAGC TTCTCATTTT TTAAGTTTTC 12540 

ACCATCAGCA GATGACATTG ACAGGCTCAA TTTGCTTTTT CCTAAAAAGA GGATAGTGGC 12600 

TGATACTAGG AAAAAGCAGG CATTGATTCC CGCAACGAGA GAAAAATTGT TGACCGATAG 12660 

AGCTAAGAGC CAGACTCCGA AAGCTTGACC ACCAATAGCT GAAATATAGG TGATGAACTG 12720 

TGAAAAAGAA TAAGCCTCCA TCAGATCATC TTCAGCTACT TTTTCCTTAA TAAGAGGCAT 12780 

ACGCAGGCCA CCTGCAAAAT CACTGATGAT ATCACTAATG ACATTGATCA AACACAGGCT 12840 

AGAAAAGGCA AAGAGACTAG CTTGCTGAAC AACTAGGGCT GCTAGAAAAA ATAGAACCGC 12900 

CTGAAACAAA CCGCTATAGA CCATCCATTT GACCTTGTCC CTCGTGTAAT CTGCCCGAAT 12960 

CCCTGCAAAA ACTGTAAAGA GGGTCGGAAG AATCATGACA ATATTCGCGA TAGCAACAGC 13020 

AAAAGATGCT TGTGACAAGG TCGATGCATA GACGATAAAG ACCAGGTTGA AAATCGAAAC 13080 

ACCAAAAGCA TTGAAGAAGC GTGG 13104 
(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19250 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

CCGGGCAAAT AGTTTTGAAC TTTTCATCAT TTTCTCCTTT AAAACTTTCT CTCCATTATA 60 

GACTCTTTTC AGAAAGTTGT CAACAGAATT TTCAGAATTT TTGAAAATTA TTTTTCAAAC 120 

AACATCTTTG CAAAAAATAT GAATATCGTA AGCGCGTCAT AACAAGGTAT CTATCATTCA 180 

TGGAGCTCCT CCTGTATACT ATTAGTAAAG TAAATATTGG AGGATATTTT AATGCCACAA 240 

CCTATTGTTC CTGTAGAGAT TCCACAATCT CGTGGTTTTG ATTCTAAAAA GAGAAATGAT 300 

ATTCTrCTTA AAATTCGTAT TGGCAAGCTT GAAGTAAGTT TTTTTCAATG TCTCAATCTG 360

GAAATGATAG AACAGCTTTT GGATAAGGTG TTGCTCTATG ACAATTCATC TATCTAGCCT 420

AGGGCAGGTC TATCTCGTGT GTGGGAAAAC TGATATGAGA CAAGGAATCG ATTCACTGGC 480 

TTATCTCGTT AAAACCCACT TTGAATTGGA TCCTTTCTCC GGTCAAATCT TTCTCTTTTG 540 

TGGTGGACGT AAAGACCGCT TTAAAGTCCT TTACTGGGAT GGTCAAGGAT TTTGGCTACT 600 

ATATAAACGC TTTGAGAACG GCAGACTGAC TTGGCCCAGT ACAGAAAAGG ATGTCAAAGC 660 

TCTCGCACCT GAACAAGTAG ATTGGCTGAT GAAAGGCTTT TCTATCACTC CAAAAATATA 720 

GTAGATTGAA ACTAGAATAG TACACCTCTG CTTCTAAAAC ATTGTTAGAA ATCGATTTTA 780 

CTGTCCTGAT CGATTTGTCC TGTTATTATT TCATTTTACT ATAAATCCAT CAGAAAGTCG 840 

TGATTTCTAT TGAAATGAGG ACTTTCTTTT TATACTCATC TGCTTTCAAA AAGCACTCTA 900 

GTCCATCTCC GATTAACGAT GGACTTTATC ACCTCCTTCT CCAGTCCTTG TATAACATCT 960 

TGAAGTTGAT TCATGACATC TTCCAAAGTT CGAAAGGCTT TATTCTTAAA TCCACGTTTA 1020 

CGAATCTCTT TCCACACTTG TTCAATGGGG TTCATCTCTG GTGTGTATGG AGGAATAAAT 1080 

GCAAAGCCAA TATTAGTCGG AATCTTTAAG GTACTTGATT TATGCCATAT AGCATTGTCC 1140 

ATAACGAGTA AAAGATAATC ATCTGGATAA GCTTGTGAAA GCTCCTATTC CTAAAGCCCC 1200 

TTTATAACCT CTTGCGAGAG AGACTATTGA CTCAGCCCTT ACTTCATGCG GATGAAACCT 1260 

CCTATCGGGT TCTAGAGAGT GATAGCCATC TGACCTACTA TTGGACTTTT TTGTCAGGTA 1320 

AAGCAGAGAA ACAAGGGATT ACGGTTTACC ACCATGATCA GTGTCGAAGT GGTTCAGTAG 1380 

TACAAGAATT CCTAGGAGAT TATTCTGGCT ATGTTCATTG TGATATGTTG CGGCAGTAAC 1440 

TTAGGACTTT AGTCCTCTAG TTCTGCCTAT GCGATAGCAG TCCAAGGTTT AGGAGTAAGG 1500 

CGACGCTAAG CTTGGTAAAC TGCGAACAGC TAGAAGCTTA TCGTCAACTG GAAGAAGCTG 1560 

CACTTGTTGG ATGTTGGGCG CATGTGAGAA GGAAGTTTTT TGAAGTGCCC CCCAAGCAAG 1620 

CAGATAAATC ATCCTTAGGA GCTAAAGGTT TAGCCTATTG TGATCAGTTA TTTTCCTTGG 1680 

AAAGAGACTG GGAGGCTTTG CCAGCTGATG AACGGCTACA GAAACGTCAA GAACATCTCC 1740 

AACCCCTACT GGAAGACTTC TTTGCTTGGT GCCGTCGTCA GTCAGTTTTA TCGGGTTCAA 1800 

AACTAGGAAG GGCAATTGAA TACAGCCTCA AGTATGAAGA AACCTTTAAG ACCATTTTAA 1860 

AAGACGGACA TCTGGTCCTT TCCAATAATC TAGCTGAACG CGCCATTAAA TCATTGGTTA 1920 

TGGGACGGAG TAAAAGAGTC CAGTGGACTC TTTTAGCCTA AGCTCAGTTT AAAAAAACGA 1980 

GGGTGGTTAT TTTTAAAAAA GCGAGGGTGG TTATTTTCTC AAAGTTTTGA AGGAGCTAAA 2040 

GCAAGAGCTA TTATTATGAG TTTGTTGGAA ACAGCTAAAC GTCATCAATT ATAGTGCGTT 2100 

GAATCTATAA CAGTACGCAT CGACTGCTAA AATATTTCTA TAAATCAATT TTCCTTTCCT 2160

AATCGATTTG TTCATATCTT ATTACAATCC ATTATAAATA GCGAGAAATA TCTATCCTAT 2220

CTTCTAGAAT GTCTTCCAAA CGAGGAAACT CTCGTAAACA AAGAGGTTTT AGAGGCCTAT 2280

TTACCGTGGA CTAAAGTTGT ACAAGAAAAG TGCAAATAAG AAATCTCCAG ATTAGGAACT 2340

ATATATGAGT TCTCTAGTCT GGAGATTTTT CAATAGACTT CGTTATTGGG CGGTTACTTT 2400

CGAAACTTTG AAAACTTCAA AAAACGGATT TTTATCGCTC TGAACATCAA AAAAGAAAGG 2460

ACGAAATTTG TCCTTTCTCA AGCTTAGCTT TTCTTCAACG CACTACAGTT GACAAAGAGC 2520

CCTTTATTCT ATCAAACATG AAGCGCAAAA ACAAGCCAAA AATCCGATAG AATGGCTATC 2580

CCTCGACTAT CAAGTAAGAC ATTTCCATCA AATACGTTCA ATTTTACTCT TGTTCTACTA 2640

AGAATTAATC ATCTCGTTTT GATTTATTAA AAATATACAA TTCAGCTTTT CCTCCAAACT 2700

ATTTTATCCA CTATCCCTGT ATAGCTCTGT ATTATCTTAA CAACTTTAGT AGAGACATTT 2760

TCCTCAACAT AATCCGGAAC CGGTAATCCA AAATCCTCAT CTTGTGCCAA GCTAACAGCA 2820

GTTTCAACTG CTTGAAGAAG AGAATTTTCA TCAATGCCTG CCAAAATAAA TCCTGCCTTA 2880

TCTAAGGACT CAGGACGTTC TGTACTTGTA CGAATACATA CAGCGGGAAA AGGATAACCT 2940

TGACTAGTAA AGAAACTACT TTCTTCCGGT AAAGTTCCCG AATCAGATAC TACAACAAAT 3000

GCATTCATCT GTAAACAATT ATAGTCATGG AATCCTAGTG GCTCATGCTG AATCACACGT 3060

TTATCTAGTT TAAAACCGCT CTCTTGTAGC CTTTTCTTTG ATCTAGGATG GCAAGAATAT 3120

AAGATTGGCA TATTATACTT TTCAGCTAAT TGATTAATTG CTGTAAAGAG AGAAATAAAA 3180

TTTTTATCTG TATCAATATT TTCCTCACGG TGAGCTGAAA GTAAGATATA ACCTCCTTTT 3240

TTCAATCCCA AACGTTCATG GATATCTGAA GACTCAATAG CAGATAAATT TTTATGTAAC 3300

ACTTCTGCCA TAGGAGAACC AGTTACATAT GTGCGCTCTT TAGGTAAACC ACACTCATGT 3360

AAATACTTAC GTGCATGTTC AGAGTATGCT AAGTTAACAT CTGAAATAAC ATCAACAATC 3420

CGACGATTAG TCTCTTCCGG TAGGCACTCA TCTTTACAGC GATTGCCAGC CTCCATATGA 3480

AAAATTGGAA TATGTAAACG CTTGGCAGCA ATAGCTGATA AACAAGAATT TGTATCCCCT 3540

AAAATCAATA AAGCATCTGG TTTAATTTGA TTCATCAATT TGTATGAAGT ATTAATAATA 3600

TTCCCTACAG TAGCACCAAG ATCATCTCCA ACAGCATCCA TGTATACGTC CGGAGTGTCT 3660

AACCCTAAAT TATCAAAGAA AATACCATTT AAATTGTAAT CATAGTTTTG TCCAGTATGT 3720

GCCAAAATAA CATCAAAATA CTTTCGACAT TTAGTGATAA CACTACTTAG ACGTATAATG 3780

TCTGGACGTG TTCCCACAAT AATCAATAAC TTAAGTTTGC CATTATCTTT AAAGTGAATA 3840

TCACTATAAT CTGTCTTAAT TTTCATTTAT TTCTCCACTT GTTCAAAAAA AGTATCTGGA 3900

TGTCTAGGAT CAAATGACTC ATTAGCCCAC ATGACAGTAA TTAGATTTTC TGTATCAGAA 3960

AGATTAATAA TATTATGTGC ATAGCCCGGT ATCATATGTA TTGCTTCAAT CTTATCGCCC 4020 

GACACTTCAA AGTTCAGAAT AGGATACTCT TGACCGTTTT CATCCAGCCC TATCCTACGC 4080 

TCTTGTATTA AAGCACGACC AGAAACAACC ATGAAAAATT CCCACTTAGA ATGATGCCAA 4140 

TGTTGCCCTT TGGTAATGCC AGGTTTAGAA ATATTAACAG AAAATTGACC CGTATTTTCT 4200 

GTTTTTAATA ATTCCGTAAA ACTACCTCGT TCATCTATAT TCATTTTTAG AGGAAACTTA 4260 

AACTTATCTA CTGGTAAATA AGATAGGTAG GTAGAATACA ATTTCTTTTT AAACGATCCC 4320 

TGAGGAATTT CAGGCATAAC TAAACTATCA GGCTGTTTTT TAAATGTTTC TAATAGAGAG 4380 

ACAATCTCTC CTAAGGTTGC ACGATGAGTC GTTGGTACGT AGCAGTAGTT TCCTGATGGG 4440 

CTAGGTAAGA TTTGTAATCC ATCTAGATTA CAACGATGAG GATTTCCTTC CAATGCAGTT 4500 

AGACACTCTT GTATCAAATC ATCAATATAC AGCAACTCCA ATTCTACACT TGGATCATTT 4560 

ACTTGAATAG GTAAATCGTG AGCTAGATTA TAACAGAAAG TTGCTACAGC AGAATTGTAG 4620 

TTAGGACGGC ACCACTTCCC ATAAAGATTC GGGAAACGGT AAACTAAGAC AGGTGCTCCC 4680 

GTTTTCTTTC CATATTCAAA GAAGAGTTCT TCCCCTGCTA GCTTAGATTG TCCATATATA 4740 

GAGTTTGAAA ATCGGCCTTC TAAACTAGCT TGAGTAGAAC TTGAGAGTAG AACAGGACAA 4800 

GTGTTTTCAT ACTTTTCTAA AATCTCCAAT AATCTACTTG AAAAACCGTA ATTTCCCTCC 4860 

ATGAATTCAT CAGGATTCTG TGGACGATTG ACACCAGCTA AATGGAATAC GAAATCGGCC 4920 

TTCTTACAAT ATTCATCTAA TAAAATCGGA TCTGTATCAC GATCATACTG AAAAATCTCT 4980 

CCAATCTCTA AATTAGGACG AGTCCTATCT CGTCCATCTT TCAAAGCTTC CAGAGTACAG 5040 

ATAAGATTTT TTCCTACAAA TCCTTTCGCT CCTGTGATTA AAATATTTTT AATCATGCCC 5100 

CCTCCTTATT TTATATGCTG TTTTAATAGT TAACTCTCTC GACAATACAT GATACATTAT 5160 

ATATCCTTGA TAATTTTAAT GTATCTTAAA AGATTTTACA TCTCTTCGTC TGCTACCATA 5220 

TCACGAATTG CTGTCTGTAT TTCATCTAAT TCTAGCAACT TTCTTTTAAC TTGCTCTACA 5280 

TCCATCAAAT CGGTATTATT ACTATTGAAT TCTGTCAACA AATTTCTATT CGTACTACCA 5340 

TCTTTGAAAT ACTTATCATA GTTAAGATTA CGATTATCAC TAGGAACTCT ATAAAAATCA 5400 

CCCAAATCAA TTGCATTTGC GCACTCTTOG TTAGTTAATA GTGTTTCATA CCTTTTTTCT 5460 

CCGTGTCTAA TACCTATAAT CTTAATATCT TGTTCTGAGG CAAAAATTTC TGATACAGCC 5520 

TTAGCCAACA CTTCAATCGT ACATGCTGGT GCTTTCTGAA CTAGTATATC TCCAGATTTC 5580 

CCTTCTTCAA ATGCAAATAA AACCAAGTCT ACTGCTTCTT CCAATGTCAT CACAAAACGT 5640 

GTCATGCTAG GTTCAGTAAT TGTAAGAGCA TTTCCTTGCT TAATTTGCTC AATCCAAAGA 5700 




GGAACGACAG ATCCACGGCT ACACAGAACA TTCCCATAGC GAGTCACACA TATCTTTGTA 5760

TGCTCAGGAT TTACCGTCCT GGACTTAGCA ACAGCAATCT TTTCCATCAT AGCCTTGGAT 5820

GTTCCCATAG CATTGACAGG ATAAGCCGCC TTATCTGTAG AAAGACAGAT AACTTGCTTT 5880

ACACCAGCTT CGATAGCCGC AGTGAGGACA TTCTCCGTTG CCAAAATGTT AGTTTTTACC 5940

GCTTCTACAG GGAAAAATTC ACAAGAAGGT ACTTGTTTAA GAGCAGCAGC GTGAAAAACA 6000

TAATCCACAC CATGCATAGC ATTTTTTACC GAAGCTAAGT CACGCACATC TCCAAGGTAA 6060

AAACGGATTT TCCCAGCCAC TTCTGGTACT TTTACCTGAA ACTCATGACG CATATCATCT 6120

TGTTTCTTTT CATCTCGCGA AAATATACGA ATCTCTGAGA CATCTGTTTC TAAAAAACGC 6180

TTGAGAACCG CATTCCCAAA TGAACCTGTC CCTCCTGTAA TTAGGAGAGT TTTTCCTGTA 6240

AATTGTGACA TATATTACAC TTCTCCTTCT AGTATGTCTG CAATTTTCTT ACAAGCCGTT 6300

CCATCTCCAT ATGGATTTGA AGCTTGACTC ATTGCTTGAT AAACTGAATC ATTTTCTAAT 6360

AATTCTTTAA AATGCCTATA AATATTATTT TCATCAGCAC CTACAAGTTT CAAAGTCCCT 6420

GCTTCAATTC CCTCTGGACG TTCAGTTGTA TCTCTCATAA CCAAAACAGG TTTTCCTAAA 6480

CTTGGAGCCT CTTCCTGAAT ACCACCACTA TCTGTTAAAA TTAAATAACT TCTTGATAAA 6540

AAATTGTGAA AATCTAATAC TTCTAAAGGT TCGATCATCT TGATACGTTC ACAGCCACTT 6600

AGTTCTTCCT CAGCAATTTG GCGAACACGA GGATTCATAT GGATAGGATA AATAGCCTTG 6660

ACATCTGAAT ATTCTTCAAT AATCCTTCTA ATTGCTCTAA ACATATGTCT CATCGGTTCA 6720

CCAAGATTTT CACGACGATG AGCTGTAATT AGAATAAACC TGCTTTCTCC TATCCATTCT 6780

AACTCAGGAT GCGTATAGTC CTCTTGAATT GTAGTTTGTA AAGCATCAAT CGCCGTATTA 6840

CCTGTCACAA ATATGCTCTC TGGAGTTTTT CCTTCTCTTA AAAGATTATC TTTTGAAAGT 6900

TGTGTTGGTG TAAAATGATA CTGAGCCAAA ACCCCAACTG CTTGACGATT AAACTCTTCA 6960

GGATATGGTG AATAGATATC GTAAGTGCGC AAACCAGCTT CAACATGACC AATTGGAATC 7020

TGTAAATAAA AGGCCGCCAG TGAACTAGCG AAGGTCGTAC TTGTATCCCC ATGAACTAAC 7080

ACCAAATCAG GTTTTTCTGA CTCTAAAATA GCCTTCATTC CTTCCAAAAT GCCAATGGTC 7140

ACATCAAATA AAGTTTGTTT ATCTTTCATA ATAGACAAAT CAAAATCGGG AATAATCCCA 7200

AATGTGTCCA AGACCTGATC CAACATTTGA CGGTGTTGGC CCGTAACGCA AACTAATGTT 7260

TCAATATTCT TACGTGTTCT TAACTCTTTG ACCAAAGGAC ACATCTTGAT GGCTTCTGGA 7320

CGAGTTCCAA ATACTACAAC TACTTTTTTC ATATATTTAC TTACTCCTAA CAAATAATGA 7380

ACGGTTCTTA AAATAAATTA GATAACGGCT AATCCATAAC ACCACCTCAG ACATACTTGA 7440

ACAAATAGCT AATGTTACTA AACTAAAATT ATCAGACAAG ATAAATATTC CTAATCCCAA 7500

AGTTTGGACA ATCGAAGCTA ATATAGTTGT CATTGTAGTT TCTTTCACTT TATCAATAGC 7560 

TCCTAAGACA GGCCATCCGT AAATCATAGA ATAAAAACTA GCAACAAAAG CGGGTAATAA 7620 

GTACTTAAGA AAATCTGCTG AAACGGTATA TTTTTCACCA CCAATTATAG AAAGAATTTG 7680 

ATTTGAAAAG AATAAAACTA TCAAAACTCC AAAGATAATA GGAATAAACA TAATCCGATT 7740 

AATACTCTTA ACCGATTGTA TATCTTTAGT ACGTATCATA TGCGGATATA AACTATTCGC 7800 

TATAGGATTA TACAATGATT TTGCTGCTGA AAGCAGTTGC ATTGCTATCC CCCAAAAGGC 7860 

TATCTCTTGA CTTTGTAAAT AAAAACCCGA AATGACTGTC GTAAAGACGC CAAAAATAGT 7920

AGTTGCAAAA TTGGATAAAA AATAAATAGA GGATTCCTTT AAATCTTTAA CCCAAACAGA 7980 

CAGATAAGAA AATGATAATT TAATTCCATA ATAATGAAGG AATCTATAAG AAACTACTGC 8040 

AGCAACTAAA TTCCCAATTC CTTCCAATAT AGGAATCCAT AAAATAGAAG AATCATCTTT 8100 

TACTACAATA AATGTCAAAA TTGTAATGAT AGTTTTAGAA ATAATATAAG GAATTGCAAC 8160 

TGCATGCATC TTTTCAATTC CACGAAATAA AAAGTCAAAG ATAAAAATAT TGGTCACTGT 8220 

AGCTAACAAA TAAAAAACTG AAAAAAGAAT ATTCTCTCTC ATTATTGGGA TTTGCCACAT 8280 

CAATATGGTG TAAATTAGAA TCGAAATGAT AGATAAAAAT ATTTTTTCAA CTAGAGTATC 8340 

TCCAACTATC CTTCCAATCT TTGAGGGAGT AGTACAAGCA TTTACAATAT TTTTTGTAGC 8400 

TGATATCATG AAACCAAAAT CAATCACCAG TTGAACATAA GCTATTAACG CTTTAACATA 8460 

AATAACCATT CCATACGCGT CTAGCGAAAG CACCCTTGTC AAATACGGGA GTGTTAATAA 8520 

AGGAAATAGT AATTTAACAA TATTCAGAAT ATAGAGAGAA CTTGTATTTT TTATAAATGA 8580 

AATTCTATCA ACTTTCACGA ACTAGTCCTT CCAAAAAAAG ATCTAAATAG TCCAAACTAC 8640 

TTCTCGCTTT CAACACCAAT TCTGAAGGTA TTGTTATCGG TTTTAGATGA AAAGTTTCAA 8700 

GTTTCTTTAC AATACTATTA ACACTTGAAT CAAATAAAGA TTCACAACGT TGTAACTCTC 8760 

CAATTGCTCC ATAATAACGT GCTGTTTTTT CTGGATGGCA TGCAATGGCA ATCACAGAT V 8820 

TATTAAAACA TGTTGCCACT ACCCCAACAT GTAATTTACA AGTTAAAACC ACATGTACCA 8880

TTTTCAACAA TGATGTCATT TCTGCAGGAG AATGATACTT GAATTGAAAA CAATCCTCAG 8940 

TTCTAACTAA TTTTCTAAAT TCCTGATAAT AAGCATCTTC ATAAGGTAGA ATGGAATCCG 9000 

AAGTTACTAC AACATAATAG TTAGGATTGT TTTCTAGAAA AAGACTAATT GATTCCGCAA 9060 

ATTTTTCAAG AGCTTTTTTG GAATGATTAT AGTGAACAAG AATTATCTTC TTATCTTTAG 9120 

CTTCTCTTTT CAATTGACAC AGCTGCTCTG TTTTTTCTTC TCTTAATTTA CTTGAAATAA 9180 

TTAAATCAAA GGTTTCATGC ACTGGAGCCG AAGGCGACAA ATGCTTCAAA GAATCAAATG 9240 
ATTCTCGATC ACGAACTGTA ATAAATTGAG CATGATTAAT AATTCTCTTT ATACCATAAT 9300

TCATCAAAGA ATCGTTATTA GGCCCTGCAC CAATACCTAA TACTCCTATA GGCTTTTTAA 9360 

AATATGAAGC CCAAATTCCC AAAGGTAAAA ATCGTTTAAA TTGGATTAAA TTATCACGAA 9420 

AACGTGCATT ATGCCCTTCC CCAAAATATC CTCCCGGGAT ATACAAAATA GCATCTGCTT 9480 

GTTTTTTAGT AAAACTTTGT TTTTGGCGAT ATTCTTTCAA GTACATTTGA AAGAAATCTG 9540 

ATGGATTATA AAAAGAAACT TCATATCCTT TAGATTCTAA TAAATCATAG ACAATCTCAC 9600 

CGTAAAGATA ATCACCGTAA TTACTTGAAC CATAATCCGT TGCACCATGT AACATAATTT 9660 

TTTTCACCAC TATTTTTTCA ACCTCCTAAA AATAAATATC ATAATCAAAG TATACATAAT 9720 

AGGACGATAA ACATCTATTG AACTACTTCT CACTAAAAGC AATAGTTGAG AAATTACCGA 9780 

AAAATAAATA ACTTTTGAGA TTTTACTTGT TTGAAAAGCT CTGAAATTTA ATCGCCATCC 9840 

ACTAAATATT CCCAAAACAA AACTCCAAAA AACACCACCA TAGTAACCAA AGTTCCAAAA 9900 

TAATTCTTCC ACAAAAGAAG AGCCTACAGG TAACCCCAAA AATTTATTAA TAACAACCGT 9960 

CGCTGATGCT TTATCAAAAA AATCACCAAC TAACCATCCA ATAGGAAAAA TTGATAGGAT 10020 

AGTGCGTAGA AATGTCATCC CATATTCATA TGGAATGCTA CTAGGCACAA CAGTTACAGC 10080 

AGAAGCTACT GTTAGGCTGG TCAGTCCCGA CTCTGAAAAT ACTTCCCCTA GTATATTCTT 10140 

TACAAAATCT AATGAAGAAA AGGAATCAAA TAAGTATATA CCTATAGTAT TCAAGTCGAA 10200 

ACGGTGCCCC CTAATAACAA CTAATACATT TAATAGAAAT ACAGTTACTA TTAAAAATAC 10260 

AAGTACTCTT TTCTTCGAAA AAGTAATCCC TAAAGATTGT GTGTATACTA AAACCAACGC 10320 

CAAGATTGAA AACACCTGGA TTTTACGACT TCCTGTTAGG ATCATTATCA AAATTAGGTA 10380 

AAACAACATT ACCCAAAAAA TAGTACGCTT TATAACTCGG GACAGCTTAT CTGAATAAAA 10440 

CAAGGAGAAC ACACCAGGAA GCATAAGTAC TCCTAAATCA TCTATTATTC CTGAACTAGC 10500 

TGCCTCTGAA TATGCTGAAT AGCTATTCGC CGCTCTAACT GCTAGTACTG TTTTAGAATC 10560 

AGTTATTACC CTAGAAATAA AGCCCACTCC TGTTAAAATC CTACGCGCAT TGTACAAAAT 10620 

TTTCTCTTCA TTTTCCTGAT AATTTTGTAC TTCTGAATGA TAATGTACCT TTCCATCACT 10680 

ATAAAAAAAT AAATAGCCTA CAGAATAACA AAACAAAATC CAAATTATAA AAATATATGA 10740 

ATGAAATAAT TCTTCATTAT TATAGAAGTT ACTAGGGCTC CACAGCAGAG TTGTTTGAAA 10800 

CCCCATATAC TCATTGAAAA TTAATCCAAA CATAAAAAAA TAAGATAAAA TCAGATACCA 10860 

TACAGAAAAA TCATATATAC TAACTTTTTG TAAAATAAAA CCAGTAATTT GAAAAATAAT 10920 

TAGAAAGCAA ACCCATATAA ATATAGACGG AACATAATTA GATATAAGAA AACCATTATT 10980 
CCAATTATCG AGAGTCCAGA ACAAGTAACA GAAAGCAAAT ATAAAACTTA ATGTCACTAG 11040
TGTCACTCTA CAAATATACT TTGTCTGCAT CTATATCTCC TTTATTACAC ACATTTCTTG 11100 
ATAACGATTC AATAATTTAC TAGCTTGATA ACAAATATCA TAGAGTCCAT CTGTCATACT 11160 
GTTATTTATT TCAAAACGAT TGCATTCCTC AGATGTTAAA GACAGTACTT TATCTTTCCA 11220 
TAGCAACACA GACTCTTCGT TGATAGGTAA GTAACTAATG TTTTTGGTCA CATCTACTTC 11280 

TTGCGTCACT GTATCTGACG ATAAAATTTG TAATCCCGAT GCCTGAGCCT CTACTAGAGA 11340 

AACAGGCAAC CCCTCATATT TAGACGGAAG CAAAAAAACA TCCATCGCAG ATAATAAATC 11400 

AGAAATATCA GTCCTTCTCC CTAAAAATAG CACATATGGG GTCAGATTTA GTTCTAAAGC 11460 

TTTCTGTTTT AATTTCTGCT CATCCTCACC ATTACCAACT AGGAGTAAAA TAACATTTGG 11520 

TTTGATTAAA ATGAGTTCTT TTAAAACGTT AAATAAATAA CTTTGGTTTT TTTGATCTGA 11580 

TAGGCGAGCT ATATTTCCTA ATACGAACTT ATTTGACACA TCTAATTCTC TACGACATTT 11640 

TTCTCTAACA TCTGACAAAA ATTGATACTT TTTCAAATCA ATTGCATTAA AAATAATTTC 11700 

AATTTTTCCG TCTTTATACG CTTTCTCTCC ATATAACCAC TTAGCCGAAT CTTCCCCACA 11760 

TGCAAACCAA TGAGTTGCTA AGATTTTTAC CAAAATTGTT ACTAATTTAC GCAATACTTT 11820 

TTGAAAACTG TTTTCTGTTA CATAAGCCAT ATGACTATGA ATAATTCTAA TTTTACAACC 11880 

AATTATTTTA GATAAGATCA GACCAATTGC AGATTTATAG CCATGGCAAT GAACTATATC 11940 

ATAATCTCCT TTCTTTATTA TTCTAGCAAG AGAGAGAAAC TGATGTAGAG GCTTTTTCCT 12000 

TAATAGAGGC ACATGATAAA CCTTTGCACC CAATTCTTTC ATTTTATCCT CTAAAAATCC 12060 

TTGTTCTTTT CCAGGCACAA TAAAATCAAA TTGAATTTTT TTTCTATCAA TGTGAGAATA 12120 

ATAGTTGAAT AGAAAACTTT CTACTCCACC ACTATCTAGT GTTGTAAATA GATGTAATAC 12180 

TTTAATCATT CTTCTTCCTT AAGCTTAAGA TTCGCTTCTC TAATTCTATT TCTGTTTTTT 12240 

GTTTTTCTAA ACTAATTCTG TCCATGAAGT TATCACAATT CTTAATTAGC TGTTTCCTGT 12300 

CAAGGTTTTG AATATACAAA GCCAAACAAT CTTTTTCCGA TTCATCCTTC ATAGGTAAAA 12360 

CGAAACCAAA ACCATTCTCT ATTGACACTT TTTCCATATA AGTATCTTCA CAAACTAAAA 12420 

TAGGTTTATA CAACAATGCA GCAAAGTAGA GTTTATTAGA CAAAGCATAG TCTAGTAAGG 12480 

GAGTGTGATT CCCGTATAAA TTCAAAACAA CATCTGTATT CTTATAAAAA GACATGGTAT 12540 

CTTTAGGCTG GAATGTGTCC ACCAAGTTAA CATTGCTGAT ATTTTTTTCT TGACAAAATT 12600 

CCCTTAATTC TCCTGCATTA GTACCTATAA AATTCAACTG AAATCGACTG TCATTTGCAA 12660 

AAAAATCGAT TATTTTTTTA TTTTGTTCTT GAAAACGAAT TAAACCAATG TAGGAAAGTT 12720 

GAATTGGAAA CGTACTATTA TTTTTTAACT GCTTTACCTC GTTTAATTCT ATCATATTGG 12780 
GTAGGTTATG GGTAGTAAAA TACTCTCCCA TTGGTAAAAA AAATTTATAG CCGTCTGAAG 12840

AAACGATATT CATTAAAGAA TTTTTCACCA ATTGTTTCTG AACCAAACGA TAAACCAAAA 12900

ATTTTTCATA ACTGTAATCA CGAATATCAT AAATATATCT ATTTTTAAAT GAAAAGAGAA 12960

GAAAATCTAC TAAAATGAAA GACACAATAC TATGTAACGG CAATATCATA TCATAATCAT 13020

TTTCTTTTAG CTTCTTTTTA ATTTCTTTTC TGAATTTTAC ATAACCTAAT ATCTTACTTA 13080

ATTTTCCTTT ACCAGAAAAA GAAATACGAT AGTAGTTTTG TTTTGTAATA ATCTCGTTAA 13140

TATTCTTATC CCAATATATA ACATCGTAAC TAATAGACAG TTTCTTCAAT AATTCTTTAT 13200

AAAAATTGAA GTAAGGAGTT AGATATATAT TATCAGATAG TATAAACAGT ACTCTCATTA 13260

AATTATTCTT TCTTACTTTC CCTCTCTAAA CATGTCTCCA GTTCGAGCAT AAACTGCTCT 13320

TTTGAAAAGT GATTTTCATA GTAACAACGA GCTTTCTTTC CTAACTCTCT TTGTCTCTTA 13380

ATAGATAACA TACTAAATTT ACAAATATTT TTTGCCAATT GTTTTACATC TCGTTCGGGA 13440

CTAACATATC CACAATTTGC TTCTTCTACA ATTATTTTAG CATGTCCTGA AATTGCACCT 13500

ATAATTGGTT TGCCTGCCGC CATATAAGAk TGTACCTTCC CAGGTATAGT ACGAGAAACT 13560

ATCGAGTCTC CTATTAAAGA AACTAACATA GCATCTGATT TTTTATAGAA GGATGGCATT 13620

TCCTCCAAAG AACGTCTTCC ATAGAAGGAA ATATTCTTTA ACTCCAATTC ATGAGCTAAT 13680

GCTTTCATGC TTAACAATTC CGTACCATCT CCAACAAAAT GAAAATGAAT TTTCTTGGGT 13740

AAATTGGTAT TCTTCTCTAT CAAACTGGCA GCTTTCAAAA TAGTTTCCAA ATTTTGTGCT 13800

TTGCCAATAT TACCAGCAAA AGTTAGGTCA ACACTTTCTT TATTAACTAT AGATTCATCA 13860

GGGATAAAAA GATCTTCTGC ATATTGTGGC AAATATGTAA TCTTTTGTTC GGATATGTCA 13920

AATTGCTTCA CAAAATAATT TTTAAATGAT GGACTAGTGA CAAATATATA ATCACTAGCT 13980

CGGTAAACTT TTTTTGAGAT AAATTTAAAC AGCTTGAAAA TCAAGCCATC TTGTTTCACT 14040

CCACCTACGG TTAAACTATC TGGCCAAACA TCCATACAAT ATAGAAACAT CGGTTTCTTA 14100

TATTTriTIT TATAAGCCAT ACCAGCCCAT GCCATCATAA CTGGAGACAA TTGGTTAACG 14160

AATACACAGT CAAAATTCGA TCCATCTTTC GTTTTATACC TCCCCAATAA AACTCCTAAA 14220

GTAGAACTAA TTGCAAAGCT AAAATAATTC AACAATCGAA ATACAACACT TTTTTTTCTA 14280

GGGATTGTAT AAGAACGATA TATCGTAACA CCTTCTATAA TCTCACGTCT TTTTTTATTA 14340

TGACGATAAT CTGCATATAT CTTCCCTTCA GGGTAATTAG GAATCCCAGC CAAAACAGAG 14400

ACTTCATGCC CTTTTCGAAC TAAATCTTCA CAAATATCTG ACAACGTGAA TGGTTCTGGC 14460

TTATAATGTT GGCAAACAAA TAGTATTTTC ATTGTCCAAT TTAACTTTCT TTCTTACCAC 14520
TACCCTCTAC AATACCTTTT CGTTTCAGTA CGTAAGGTAT TGTCTTAACT ATACATCTAA 14580

TATCCATTAT CAAAGACAGA TGTTTAACAT AGTAGCCATC TAACTCCGTC TTCATCTCAA 14640

CAGACAAAGT ATCACGCCCG TTAATTTGTG CCCATCCAGT TAACCCTGGC AAGATATCAT 14700

TTGCTCCATA CTTATCTCTC TCTGCAATCA AATCTAGTTC ATTTATACCC GCTGGTCTAG 14760

GACCTACAAT ACTCATATTA CCAACAAGAA TATTAAACAA TTGTGGTAGT TCATCCAAAG 14820

ATGTTTTTCG CAAGAAAGPP CCTACTTTTG TAATCyATTG CTCTGGATTA TATAAGTTTC 14880


GAGGCGCCAC *» X X X A X /»VJVJ X GCATCTATTT TCATAGACCT AAATTTCAAA A T A T* A ^ A TV fm Al Al AVaAALti 14940

ATTCTTTATG A at app & a art GGTTTTTGCT TAAATATAAC CGGACCTTCT bAAl LAAun 15000

TAATCfiPAAT TflP A ATTa'TV^ ATAAAAACCG GACACAATAT TATTATCCCT A IT AAAGAT A 15060

ATAATATATP a ppt a ATrvT TTTATTATAC CGTACATAAA CAACCTCCAA <T AT 1 A A A fflrfvi 15120

T A TTTPP ji rpm 1 1 ILAi 1\.1A TTTCCATTTG ACAAATTAAA TCAGGCAGTA <_ATvJCAACTA 15180

CACJAAAPTPA ATATATATTT GGTCACTCAA TGATTTTCAG AAATATAATT CTTTTATCCT 15240

CTACGTPAC3A CTCCATCTAA ACAAAATTTA TTTGTTTCAG XAAlAlAlwt 15300

GTTPTPAATA AGGTCCAGTT CAATTATTCT TCCAAATAGA CCGAATATTA 15360

TTTfiAArtAPA TATCGGTTTC TGAAATTGCA ATCAGTACAT AAGCTAATAA ACTGATAAGT 15420

ATGCTCTGTA AfiAATfiPPArt AGTTATATTG TAGTCCCCTT CCATACTATA TTCATTTTAT 15480

TTTT T A PP AT A A TTnTC A T A nAl 1 1LLA1A GGAACCGTAA ACTCCATACT TATTAACCGA GATATCCAAT 15540

TTATTTAAAA PAAPTPPTAn GAACAGTTTC CCTGTTTGTT TTAATTGTTG TTTCGCTTTT 15600

TGGATATCAC X X A A X X l«UV> CTCACCTGTT GCTGTTACCA AGATGGACGC ATCACACTTT 15660


TGAGTGATAA TTGCCGCATC AATAACAATT CCAATAGGCG GTGTATCAAT AATGATATAA 15720

TCAAAATATT TACGCAATGT TTCAATCATA TCATTAAAAT TTTTACTTTG TAACAAGGCT 15780

GTAGGGTTTG GTGATACAGA TCCCGATTGA ACTACAAATA AATTTTCAAT ATTTGTATCA 15840

CATAAACCGT GAGATAAATC AGCTGTCCCA GATAAAAATT CTGTTAGCCC TGTAATTTIT 15900

TCACGAGATT TAAAAACTCC TAACATAACT GAATTTCGAG TATCGCCATC GATCAAAAGA 15960

GTTTTATAGC CTGCACGCGC AAACGACCAT GCTATATTTA TGGAAGTAGT TGTTTTTCCT 16020

TCCCCAGGGT TAACAGAAGT AACGGAAATT ACTTTTAGTT TATCTCCGCT CAACTGTATA 16080

TTTGTACACA AGGCATTGTA ATATTCTTCT GCCTTCTTAA TGAACTCCAG TTTTTTTTGT 16140

GCTATTTCTA ATGTCGGCAT CCTTCTCTCC TATTTCAACT TACCCAAGTT TGGCACAACt 16200

CCCAAAAGTG TCATCTGCAA TGTATTTTCG ATATCTTCCG GACGTTTCAC ACGAGTATCC 16260

AAAAGTTCAA GATGAAGAAC TATAACACTA GTTCCAATCA CCCCTGCCAA AAAACCAATT 16320
AGTGTATTGC GTTTAATATT TGGCGAAGAC GGGGATATCG CCGGCCTTGC CTCCTCCAGT 16380

GTTGTCACGT CAGAAACACG AGTAATACTG ATAATTTTTT GAGCAGCTAC TTCTCTCAAA 16440

GAGTTAGCGA TACGGCTTGC CTCTTCAGGA ACTCGATCAT TAACTGAAAT AGAGACAATA 16500

CGGGTATCAA CTGGTACTGT CACTTTAATT TTATTAGCCA AACCTTTTGG CGTCAAATCT 16560

AGTTTCAAAT CAGAAACAAC TTCCTCCAAA ACATCCTGCG AAAGGATAAT CTCACGGTAG 16620

TCTTTTACCA GATAAGTTCC TGCCTGCAAA TCCTGATTTG TCAACCCCGG CTTGTCTCCT 16680

TGATTGCGAT TCACTACGTA AATTCGCGTG GTACTCGTAT ATTCTGGCTT AACAATAAAA 16740

GTGCTATATG CAAAAGCCCC CGCACCTGTC ACAAGTGCCA CTATTAAAAT CATTAGCTTG 16800

CGTTTCCACA AGCTTTTAAC TAATTGAAAT ACATCGATTT CTATCGTATT TTGTTCTTTC 16860

ATCATTTCTC CTAAATTAGT TGATCCATTA CAATTTTTCG AGGATTGTCT ATAAAAAGTT 16920

CCTGAGCCTT CGCTTCTCCG TATTTTTGGG TAACAAGGTC ATATGCTTCT GCCATATGAG 16980

GAGGTCTACC GTCTAGATTG TGCATATCAC TTGCAATGAC ATGAACCAAA TCCTGCTCTA 17040

AAAAATACTG AGCTCTTTTT TTCATGAATT TATAACGTTC GCCAAAAAGT TTGGGTTTGA 17100

GGACATGTGA ACTATTTACT TGCGTGTAAC AGCCCATATC GATCAGTTCT CGAACGCGTT 17160

TTTCATTATT TTCAAGAGCA TCATAGCGCT CAATGTGGGC AATGACTGGA GTAATTCCCA 17220

ACATCAAGAT CTTGCTCAAG GCGCTATGAA TATCGCGATA AGGAGTGTTC ATACTAAACT 17280

CTATCAAGGC ATAACGACTA TCATTGAGGG TCGGAATCCG CTTTTTTTCC AGCTTATCCA 17340

GAACATCTGG TGTGTAATAA ATTTCAGCCC CGTAAGCAAT GACCAAGTCA CTCGCCACTT 17400

CCTTAGCTAT TTCCCGAACC TGAAGAAAGT TTTCTGCTAT CTTCTCTTCC GGAGTTTCAA 17460

ACATGCCCTT GCGACGGTGA GAGGTAGAAA CAATGGTTCG CACCCCCTGT CTGTAGGATT 17520

CTGCCAAGAG AGCCTTGCTT TCCTCTCTTG ACTTGGGACC GTCATCTACA TCAAAAACGA 17580

TATGCGAATG GATGTCTATC ATTTCATCTA CCCTCCATCA CATCCTGTAT AGCTGCTTTA 17640

ACTACAGCTA AACTACTATC ATCTATTTCC ATCACATAGA GGTTACTGTC TGGCATTGCA 17700

TAAGAAGGAA GATCCATCCG ACCTGTCCCT TTTAAATCTT GAGAATTTAC TTTATAATTC 17760

CCTCCACTTT CTAACTGAGC ATTGACCAAA TTTATCATGG TCTCAAGTGG CATATTTGTT 17820

TGGATAGAAT CTTGCAAGCT ATTAATGATC GTACTATAAT TTTTCAGCAC TTCGGTTGAC 17880

GTTAATTTTT GAAGGATAGC CACAATCACC TTTTGTTGAT GGCGCCCGCG GTCACGATCG 17940

CCATCTGCTA GGGAGTAGCG CTCACGAACA AAACCGAGAG CCTGTTCTGA ATCAAGATGA 18000

ACATTGCCTG CAGGGTAATA CTTTCCATTC GTATGGGCAG TAAATTCTTG ATCATTATAA 18060
ACATCAATTC CACCCAACAA ATCAATCAAT TTCAAAAACG AAGTGAAGTT CAATCGCACA 18120

TAGTAATTGA TATCCACTCC ATAGAGATTT TCTAAGGTGT GAATGGACGA ATCAACTCCA 18180

TAAATGCCCG CATGAGTCAA TTTATCTTTT TGATTATTTC CACCATCTGC GATTGGTACA 18240

TAGGCATCAC GTGGCGTTGT GGTCAAGAGG ATTTTCTTGG TATCTCGATT GACAGTCATC 18300

AGGATGTTGA CATCTGATCG CGACACCGAA CTAATAGGAC CATAGGTGTC AATTCCACTA 18360

ACATAGATAT TGAAAGACTG ACTCTTAGAC GTCTTAGGAG CTTCTACTTT TTTAGTGAAT 18420

CCCTTAGTAT AAATCTTTTT TATCTTCGAT GCGTAGTCTG GATACTCTGA CTCGATGATG 18480

TTTTCAAAGA CACTATTTAG GACAATGGCC TTAGTCTCCC CTGCAATCAA ACTCTTGTAA 18540

GCTGCCAAGT AAGACGAACT CTGGTTGACC GTCAAATCGG TATTCTGACT TGACTTGATA 18600

TCAGCTAGTA ATTTCTGAAT ATTTTCATTA TTAGTCCCAG TCGGTGCTGT CACACTCGTC 18660

AGTTGCGTAA CATTTTCGAT CTCACTATCT GCTAAAACAG CGACACTGAT TGAATATTCT 18720

GAGTAATTAG AAGTCGCATT TAAACGATTG GTCAGTCCAA CAAACTGCTG TACTGCAAAG 18780

AGCGACACAG AGCTGACAAG GATAGAGAAC ACCAACAGAA AAATAGTAAA CTTTTCAGCT 18840

TTTTTATAGA TAATCAAGAG TAGCCCTACC AAGGCAACTA GTAGGACTAA CGCAGTTACC 18900

ACTAGATTAA GATATCTAAA AGCAAGGATA TTGTACTTAA AGATTAAGAA CAATAAAAAA 18960

CAAACTAACA ATAAATAAAT AGTCAGCAAA ACTATATTAA CACTTCGCTT CACTTTCTGT 19020

GAACGTGATT TTTTAAAACG TCTACTCATG ATTAATACCT ATACATTGAA CATTATACGA 19080

TTATATCACT TTTTTACGGT AATGTCTACA CCTTTATTTT TACTATCTGC ATCTTTAAGT 19140

ATCTTAGTAG ACTTCCCGCG AAACAAAAAT ATAGTAAAAT GAAATAAGAA CAGAACAAAT 19200

CGTTCAGGAC AGTCAAATCG ATTTCTAACA ATGTTTTAGA AGCAGAGGTG 19250

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21706 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

AAAGTTGAAA GACTGCTAGC TGTTTTTGAT ACCAATCGTT TCCAACTACA GAGCAAACAG 60 

TATACAAAGT TTGTTTTTGG ATGTAAGCTT CTTGATGGAC AATTCCAAGA AAATCAAGAA 120 

ATTGCTGACC TTCAATTTTT TGCCATTGAC CAACTGCCGA ACTTATCTGA AAAACGCATT 180 

ACCAAGGAGC AAATAGAGCT TCTTTGGCAG GTTTATCAAG GTCATAGGGG GCAATATCTT 240 
GACTAAGAAG ATGATTATCG TATTTCTAAA TCCATTTTTA ACAACTAGCA TGGTATAATA 300

ATATGCAGGA AAATTTTGAA TTATGAGGAA GACTAGATGA ATTTATGGGA TATTTTCTTT 360 

ACGACTCAGG CAACCGAGCC GCCCAAATTT GACCTTTTTT GGTATGTTAG CCTATTTACG 420 

CTCTTAGCCT TAACCTTTTA TACAGCCCAT CGCTATCGTG AAAAGAAGGT TTACCAACGA 480 

TTTTTCCAAA TCTTGCAGAC TGTTCAGTTA ATCCTTCTTT ATGGTTGGTA CTGGGTCAAT 540 

CATATGCCAC TGTCAGAAAG CCTACCCTTT TACCATTGCC GTATGGCTAT GTTTGTGGTA 600 

CTCTTGCTTC CTGGTCAATC CAAATATAAA CAATACTTTG CATTATTGGG AACATTTGGG 660 

ACATTAGCAG CCTTTGTTTA TCCAGTGCCA GATGCTTACC CTTTTCCACA TATCACCATT 720 

CTATCCTTTA TCTTTGGTCA TTTAGCACTC TTGGGGAACT CTCTAGTTTA TCTATTGAGA 780 

CAGTATAATG CGCGATTGCT GGATGTGAAG GGAATTTTTC TCATGACCTT TGCCCTAAAT 840 

GCCTTGATTT TTGTGGTCAA TTTGGTGACA GGTGGCGATT ACGGATTTTT GACAAAACCG 900 

CCATTGGTTG GGGATCAGGG TCTAGTAGCT AATTATTTAC TTGTTTCAAT TGTGCTGGTA 960 

GCTACTATCA GTTTGACTAA GAAAATCTTA GAATTCTTTT TAGCTCAAGA AGCAGAAAAA 1020 

ATGATTGCAA AGGAAGCTTA ACACAGAGCT TTCTTTTTTG CTCTTAGAGA GTTTTTACAA 1080 

GCAGCTTATA AAATAAGAAT TTCTGAATAG ACAAACTCAA AAAATGGCTG GGAAATTTAG 1140

GAAAAAAGCA AGCACGATTA AATTTTTTGT GTTATAATAT TTTGTGAATA GCTATGCCTA 1200 

TGTTTAGCTA TGGAATAATA CGAAGTGCGA AACTTGGAAG ATAGAGAGGA AGCGATGTAA 1260 

TGGCTAGAGA AGGCTTTTTT ACAGGTCTAG ATATTGGAAC AAGCTCTGTC AAGGTGCTTG 1320 

TGGCCGAGCA GAGAAATGGT GAATTAAATG TAATTGGCGT GAGTAATGCC AAAAGTAAAG 1380 

GTGTAAAGGA TGGAATTATT GTTGATATTG ATGCAGCAGC AACTGCTATC AAGTCAGCCA 1440 

TTTCCCAAGC GGAAGAAAAG GCAGGCATTT CGATTAAATC AGTGAATGTC GGCTTGCCTG 1500 

GTAATCTTTT GCAGGTAGAA CCAACTCAGG GGATGATTCC AGTAAGATCT GATACTAAGG 1560 

AAATTACGGA TCAAGATGTT GAAAATGTTG TCAAATCAGC TTTGACAAAG AGTATGAGAG 1620 

CTGACCGTGA AGTCATTACC TTTATTCCTG AAGAATTTAT TGTGGATGGT TTCCAAGGGA 1680 

TTCGTGACCC ACGTGGCATG ATGGGGGTTC GCCTTGAAAT GCGTGGTTTG CTTTATACAG 1740 

GACCTCGTAC TATCTTGCAC AATTTGCGTA AGACGGTTGA GCGTGCAGGT GTTCAGGTTG 1800 

AAAATGTTAT CATTTCACCA CTAGCAATGG TTCAGTCTGT TTTGAACGAA GGGGAACGTG 1860

AATTTGGTGC TACAGTGATT GATATGGGGG CAGGTCAAAC GACTGTCGCT AGAATCCGTA 1920 

ATCAAGAACT CCAGTTCACA CATATTCTCC AAGAAGGTGG AGATTATGTA ACTAAAGATA 1980 
TCTCCAAGGT TTTGAAAACC TCTCGCAAAT TAGCGGAAGG CTTGAAACTG AATTACGGGG 2040
AAGCCTATCC GCCTCTTGCA AGCAAAGAAA CCTTCCAAGT AGAGGTTATT GGAGAAGTAG 2100 
AAGCAGTCGA AGTGACGGAA GCCTACTTGT CAGAAATTAT TTCTGCACGA ATCAAGCACA 2160 
TCCTTGAACA AATCAAGCAA GAATTAGATA GAAGGCGTCT ATTGGACCTC CCTGGTGGTA 2220 
TTGTCTTAAT CGGTGGGAAT GCCATTTTAC CAGGTATGGT TGAGCTTGCT CAGGAAGTCT 2280 
TTGGCGTCCG TGTCAAGCTT TATGTTCCAA ATCAAGTTGG TATCCGTAAT CCAGCCTTTG 2340 
CGCATGTGAT TAGTTTATCA GAATTTGCGG GTCAATTAAC AGAAGTTAAT CTTTTGGCTC 2400 
AGGGAGCGAT AAAAGGTGAG AATGACTTAA GTCATCAGCC AATTAGTTTT GGTGGGATGC 2460 
TGCAAAAAAC AGCTCAGTTT GTACAATCAA CGCCTGTTCA ACCAGCTCCT GCTCCAGAAG 2520 
TAGAGCCGGT GGCGCCTACA GAACCAATGG CGGATTTCCA ACAAGCTTCA CAAAATAAAC 2580 
CGAAATTAGC AGATCGTTTC CGTGGATTGA TCGGAAGCAT GTTTGACGAA TAAAGAGGAA 2640 

AAATAAATTA TGACATTTTC ATTTGATACA GCTGCTGCTC AAGGGGCAGT GATTAAAGTA 2700 

ATTGGTGTCG GTGGAGGTGG TGGCAATGCC ATCAACCGTA TGGTCGACGA AGGTGTTACA 2760 

GGCGTAGAAT TTATCGCAGC AAACACAGAT GTACAAGCAT TGAGTAGTAC AAAAGCTGAG 2820 

ACTGTTATTC AGTTGGGACC TAAATTGACT CGTGGTTTGG GTGCAGGAGG TCAACCTGAG 2880 

GTTGGTCGTA AAGCCGCTGA AGAAAGCGAA GAAACACTGA CGGAAGCTAT TAGTGGTGCC 2940 

GATATGGTCT TCATCACTGC TGGTATGGGA GGAGGCTCTG GAACTGGAGC TGCTCCTGTT 3000 

ATTGCTCGTA TCGCCAAAGA TTTAGGTGCG CTTACAGTTG GTGTTGTAAC ACGTCCCTTT 3060 

GGTTTTGAAG GAAGTAAGCG TGGACAATTT GCTGTAGAAG GAATCAATCA ACTTCGTGAG 3120 

CATGTAGACA CTCTATTGAT TATCTCAAAC AACAATTTGC TTGAAATTGT TGATAAGAAA 3180 

ACACCGCTTT TGGAGGCTCT TAGCGAAGCG GATAACGTTC TTCGTCAAGG TGTTCAAGGG 3240

ATTACCGATT TGATTACCAA TCCAGGATTG ATTAACCTTG ACTTTGCCGA TGTGAAAACG 3300 

GTAATGGCAA ACAAAGGGAA TGCTCTTATG GGTATTGGTA TCGGTAGTGG AGAAGAACGT 3360 

GTGGTAGAAG CGGCACGTAA GGCAATCTAT TCACCACTTC TTGAAACAAC TATTGACGGT 3420 

GCTGAGGATG TTATCGTCAA CGTTACTGGT GGTCTTGACT TAACCTTGAT TGAGGCAGAA 3480 

GAGGCTTCAC AAATTGTGAA CCAGGCAGCA GGTCAAGGAG TGAACATCTG GCTCGGTACT 3540 

TCAATTGATG AAAGTATGCG TGATGAAATT CGTGTAACAG TTGTTGCAAC GGGTGTTCGT 3600 

CAAGACCGCG TAGAAAAGGT TGTGGCTCCA CAAGCTAGAT CTGCTACTAA CTACCGTGAG 3660 

ACAGTGAAAC CAGCTCATTC ACATGGCTTT GATCGTCATT TTGATATGGC AGAAACAGTT 3720 

GAATTGCCAA AACAAAATCC ACGTCGTTTG GAACCAACTC AGGCATCTGC TTTTGGTGAT 3780 
TGGGATCTTC GCCGTGAATC GATTGTTCGT ACAACAGATT CAGTCGTTTC TCCAGTCGAG 3840

CGCTTTGAAG CCCCAATTTC ACAAGATGAA GATGAATTGG ATACACCTCC ATTTTTCAAA 3900

AATCGTTAAG TAAATGAATG TAAAAGAAAA TACAGAACTT GTTTTTCGAG AAGTTGCAGA 3960

GGCTAGTCTG AGTGCTCATC GAGAGAGTGG TTCGGTCTCT GTCATTGCAG TTACCAAGTA 4020

TGTAGATGTA CCGACAGCGG AAGCCTTGCT TCGGCTAGGT GTGCATCATA TCGGTGAAAA 4080

TCGTGTAGAT AAGTTTCTGG AAAAATATGA AGCTTTAAAA GATCGAGATG TGACTTGGCA 4140

TTTGATTGGT ACCTTGCAAA GACGTAAGGT GAAAGATGTC ATTCAATACG TTGATTATTT 4200

CCATGCATTG GACTCAGTAA AGCTAGCAGG GGAAATTCAA AAAAGAAGTG ACCGAGTGAT 4260

CAAGTGTTTC CTTCAAGTAA ATATTTCTAA AGAAGAAAGC AAACACGGTT TTTCGAGAGA 4320

GGAACTGCTG GAAATCTTGC CAGAGTTAGC CAGACTAGAT AAGATTGAAT ATGTTGGTTT 4380

AATGACGATG GCACCTTTTG AGGCTAGCAG TGAGCAGTTG AAAGAGATTT TCAAGGCGGC 4440

CCAAGATTTA CAAAGAGAAA TTCAAGAGAA ACAAATTCCA AATATGCCTA TGACCGAGTT 4500

AAGTATGGGA ATGAGTCGTG ATTATAAAGA AGCGATTCAA TTCGGTTCCA CTTTTGTTCG 4560

TATAGGTACA TCATTTTTTA AGTAGGAGAG AACCATGTCT TTAAAAGATA GATTCGATAG 4620

ATTTATAGAT TATTTTACGG AGGATGAGGA TTCAAGTCTC CCTTATGAAA AAAGAGATGA 4680

GCCTGTGTTT ACTTCAGTAA ATTCTTCACA GGAACCGGCT GTGGCAATGA ATCAACGTTC 4740

ACAGTCGGCT GGCACAAAAG AGAACAATAT CACCAGACTT CATGCAAGAC AACAGGAATT 4800

GGCAAATCAG AGTCAGCGTG CAACGGATAA GGTCATTATA GATGTTCGTT ATCCTAGAAA 4860

ATATGAGGAT GCAACAGAAA TTGTTGATTT ATTGGCAGGA AACGAAAGTA TGTTGATTGA 4920

TTTTCAGTAT ATGACAGAGG TGCAGGCTGG TCGTTGTTTG GACTATTTGG ATGGAGCTTG 4980

TCATGTTTTA GCTGGAAATT TGAAAAAGGT AGCTTCTACC ATGTATTTGT TGACACCAGT 5040

GAACGTTATT GTAAATGTTG AAGATATCGG TTTACCAGAT GAAGATGAAC AGGGTGAGTT 5100

CGGTTTTGAT ATGAAGCGAA ATAGAGTACG ATAATGATTT TTTTAATTCG TATGATTTAT 5160

AATGCAGTGG ATATTTACTC CCTGATTTTG GTAGCCTTCG CTGTCATGTC TTGGTTTCCA 5220

GGTGCCTACG AATCCAGTTT AGGTCGTTGG ATTGTAGCGT TGGTGAAACC AGTGCTTGGT 5280

CCCTTGCAAC GCCTGCCTTT ACAGATAGCG GGTCTTGATT TATCTGTTTG GGTTGCGATT 5340

GTTTTGGTTC GATTTTTAGG AGAAAACCTA GTGCGTTTTC TGGCGATGAT AGGATGAATA 5400

AAGGGATTTA TCAGCATTTC TCCATAGAAG ATGGTCCATT TCTTGACAAG GGAATGGAAT 5460

GGATAAAGAA GGTAGAAGAT AGCTATGCTC CTTTTTTAAC TCCTTTTATG AATCCTCATC 5520
AGGAGAAGCT ATTAAAGATT TTGGCCAAAA CCTATGGTCT TGCTTGTAGC AGTAGTGGGG 5580

AATTCGTCTC GAGTGAGTAT GTTCGAGTTT TATTATACCC AGATTATTTC CAACCAGAGT 5640 

TTTCAGATTT TGAAATATCT CTCCAGGAAA TTGTGTATTC CAATAAATTT GAACATTTAA 5700 

CGCATGCTAA GATTTTAGGG ACAGTCATCA ATCAATTAGG GATTGAACGG AAACTTTTTG 5760 

GAGATATCCT AGTAGATGAA GAACGGGCGC AGATTATGAT TAATCAGCAG TTTCTTCTTC 5820 

TCTTTCAAGA TGGACTAAAG AAAATTGGTC GTATACCTGT TTCGCTGGAG GAACGTCCTT 5880 

TCACCGAGAA AATAGATAAG CTAGAACAGT ATCGAGAACT GGATTTATCT GTGTCTAGTT 5940 

TTCGATTAGA TGTTCTTTTA TCAAATGTTT TGAAACTATC TAGGAATCAA GCAAACCAGT 6000 

TGATTGAAAA GAAACTTGTC CAAGTAAATT ATCATGTGGT AGACAAATCA GATTACACTG 6060 

TTCAAGTTGG AGACTTGATT AGTGTGAGAA AATTTGGTCG CTTGAGATTA CTTCAAGATA 6120 

AGGGACAAAC GAAAAAAGAG AAGAAAAAAA TAACCGTCCA GTTATTATTA AGTAAGTGAG 6180 

GAATAGAATG CCAATTACAT CATTAGAAAT AAAGGACAAG ACTTTTGGAA CTCGATTCAG 6240 

AGGTTTTGAT CCAGAAGAAG TCGATGAATT TTTAGATATT GTGGTTCGTG ATTACGAAGA 6300 

TCTTGTGCGT GCGAATCATG ATAAAAATTT GCGTATTAAG AGTTTAGAAG AGCGTTTGTC 6360 

TTACTTTGAT GAAATAAAAG ATTCATTGAG CCAGTCTGTA TTGATTGCTC AGGATACAGC 6420 

TGAGAGAGTG AAACAGGCGG CGCATGAACG TTCAAACAAT ATCATTCATC AAGCAGAGCA 6480 

AGATGCGCAA CGCTTGTTGG AAGAAGCTAA ATATAAGGCA AACGAGATTC TTCGTCAAGC 6540 

AACTGATAAT GCTAAGAAAG TCGCTGTTGA AACAGAAGAA TTGAAGAACA AGAGCCGTGT 6600 

CTTCCACCAA CGTCTCAAAT CTACAATTGA GAGTCAGTTG GCTATTGTTG AATCTTCAGA 6660 

TTGGGAAGAT ATTCTCCGTC CAACAGCTAC TTATCTTCAA ACCAGTGATG AAGCCTTTAA 6720 

AGAAGTGGTT AGCGAAGTAC TTGGAGAACC GATTCCAGCT CCAATTGAAG AAGAACCAAT 6780 

TGATATGACA CGTCAGTTCT CTCAAGCAGA AATGGCAGAA TTACAAGCTC GTATTGAGGT 6840 

AGCCGATAAA GAATTGTCTG AATTTGAAGC TCAGATTAAA CAGGAAGTGG AAGCTCCAAC 6900 

TCCTGTAGTG AGTCCTCAAG TTGAAGAAGA GCCTCTGCTC ATCCAGTTGG CCCAATGTAT 6960 

GAAGAACCAG AAGTAGCTCC AATGCATCCG ATAGGTCCAA CACCAGCTAC AGAAACTGTT 7020 

GATTCAATAC CGGGATTTGA AGCACCGCAA GAATCTGTTA CAATTTTATA AGAAATATTC 7080 

TGAGAACAAT ATCTTATCCT TATATTTCCA GCGAGCAGGA GATGGTGTGA GTCCTGTAAT 7140 

CCCTATTGAT AAGATTATCC TCTCAAAAAC TCAAGTCTGA AGCTAGTAAG ATTTGACGTT 7200 

TCCCACGTTA CGGGATAAGA GGGAGAAAGA CTAAATCTTT TTCCGAATAA AGGTGGTACC 7260 

ACGATTTTCG TCCTTTTTGG AAGTCGTGGT TTTTAATTTG TTATTATTTA TAAAGGAGAT 7320 
ACCATGAAAC TCAAAGACAC CCTTAATCTT GGGAAAACTG AATTCCCAAT GCGTGCAGGC 7380

CTTCCTACCA AAGAGCCAGT TTGGCAAAAG GAATGGGAAG ATGCAAAACT TTATCAACGT 7440 

CGTCAAGAAT TGAACCAAGG AAAACCTCAT TTCACCTTGC ATGATGGCCC TCCATACGCT 7500 

AACGGAAATA TCCACGTTGG ACATGCTATG AACAAGATTT CAAAAGATAT CATTGTTGGT 7560 

TCTAAGTCTA TGTCAGGATT TTACGCACCA TTTATTCCTG GTTGGGATAC TCATGGTCTG 7620 

CCAATCGAGG AAGTCTTGTC AAAACAAGGT GTCAAACGTA AAGAAATGGA CTTGGTTGAG 7680 

TACTTGAAAC TTTGCCGTGA GTACGCTCTT TCTCAAGTAG ATAAACAACG TGAAGATTTT 7740 

AAACGTTTGG GTGTTTCTGG TGACTGGGAA AATGCATATG TGACCTTGAC TCCTGACTAT 7800 

GAAGCAGCTC AAATTCGTGT ATTTGGTGAG ATGGCTAATA AGGGTTATAT CTACCGTGGT 7860 

GCTAAGCCAG TTTACTGGTC ATGGTCATCT GAGTCAGCAC TTGCTGAAGC AGAGATTGAA 7920 

TACCATGACT TGGTTTCAAC TTCCCTTTAC TATGCCAACA AGGTAAAAGA TGGCAAAGGA 7980 

GTTCTAGATA CAGATACTTA TATCGTTGTC TGGACAACGA CTGCATTTAC CATCACAGCT 8040 

TCTCGTGGTT TGACGGTTGG TGCAGATATT GATTACGTTT TGGTTCAACC TGCTGGTGAA 8100 

GCTCGTAAGT TTGTCGTTGC TGCTGAATTA TTGACTAGCT TGTCTGAGAA ATTTGGCTGG 8160 

GCTGATGTTC AAGTTTTGGA AACTTACCGT GGCCAAGAAC TCAACCACAT CGTAACAGAA 8220 

CACCCATGGG ATACAGCTGT AGAAGAGTTG GTAATTCTTG GTGACGACGT TACGAGTGAG 8280 

TCTGGTACAG GTATTGTCCA TACAGCCCCT GGTTTTGGTG AGGACGATTA CAATGTTGGT 8340 

ATTGCTAATA ATCTTGAAGT CGCAGTGACT GTTGATGAAC GTGGTATCAT GATGAAGAAT 8400 

GCTGGTCCTG AATTTGAAGG TCAATTCTAT GAAAAGGTAG TTCCAACTGT TATTGAAAAA 8460 

CTTGGTAACC TCCTTCTTGC CCAAGAAGAA ATCTCTCACT CATATCCATT TGACTGGCGT 8520 

ACTAAGAAAC CAATCATCTG GCGTGCAGTT CCACAATGGT TTGCCTCAGT TTCTAAATTC 8580 

CGTCAAGAAA TCTTGGACGA AATTGAAAAA GTGAAATTCC ACTCAGAATG GGGTAAAGTG 8640 

CGTCTTTACA ATATGATCCG TGACCGTGGT GACTGGGTTA TCTCTCGTCA ACGTGCTTGG 8700 

GGTGTTCCAC TTCCTATCTT CTACGCTGAA GATGGTACAG CTATCATGGT AGCTGAAACT 8760 

ATTGAACACG TAGCTCAACT TTTTGAAGAA TATGGTTCAA GCATTTGGTG GGAACGTGAT 8820 

GCCAAAGACC TCTTGCCAGA AGGATTTACT CATCCAGGTT CACCAAACGG CGAGTTCAAA 8880 

AAAGAAACTG ATATCATGGA CGTTTGGTTT GACTCAGGTT CATCATGGAA TGGAGTGGTG 8940 

GTAAACCGTC CTGAATTGAG TTACCCAGCC GACCTTTACG TAGAAGGTTG TGAGCAATAG 9000 

CGTGGTTGGT TTAACTCATC ACTTATCACA TCTGTTGCCA ACCATGGCGT AGCACCTTAG 9060 
AAACAAATCT TGTCACAAGG TTTTGCCCTT GATGGTAAAG GTGAGAAGAT GTCTAAATCT 9120

CTTGGAAATA CTATTGCTCC AAGCGATGTT GAAAAACAAT TCGGTGCTGA AATCTTGCGT 9180 

CTCTGGGTAA CAAGTGTTGA CTCAAGCAAT GACGTGCGTA TCTCTATGGA TATCTTGAGC 9240 

CAAGTTTCTG AAACTTACCG TAAGATTCGT AACACTCTTC GTTTCTTGAT TGCCAATACA 9300 

TCTGACTTTA ACCCAGCTCA AGATACAGTC GCTTACGATG AGCTTCGTTC AGTTGATAAG 9360 

TACATGACGA TTCGCTTTAA CCAGCTTGTC AAGACCATTC GTGATGCCTA TGCAGACTTT 9420 

GAATTCTTGA CGATCTACAA GGCCTTGGTG AACTTTATCA ACGTTGACTT GTCAGCCTTC 9480 

TACCTTGATT TTGCCAAAGA TGTTGTTTAC ATTGAAGGTG CCAAATCACT GGAACGCCGT 9540 

CAAATGCAGA CTGTCTTCTA TGACATTCTT GTCAAAATCA CCAAACTCTT GACACCAATC 9600 

CTTCCTCACA CTGCGGAAGA AATCTGGTCA TATCTTGAGT TTGAAACAGA AGACTTCGTC 9660 

CAATTGTCAG AATTACCAGA AGTTCAAACT TTTGCTAACC AAGAAGAAAT CTTGGATACA 9720 

TGGGCAGCCT TCATGGACTT TCGTGGACAA GCACAAAAAG CCTTGGAAGA AGCTCGTAAT 9780 

GCAAAAGTTA TCGGTAAATC ACTTGAAGCA CACTTGACAG TTTATCCAAA TGAAGTTGTG 9840 

AAAACTCTAC TCGAAGCAGT AAACAGCAAT GTAGCACAAC TTTTGATCGT GTCTGAGTTG 9900 

ACCATCGCAG AAGGACCAGC TCCGGAAGCT GCCCTTAGCT TCGAAGATGT AGCCTTCACA 9960 

GTTGAACGTG CTACTGGTGA AGTATGTGAC CGTTGCCGTC GTATCGACCC AACAACAGCA 10020 

GAACGCAGCT ACCAGGCAGT TATCTGTGAC CACTGTGCAA GCATCGTAGA AGAAAACTTT 10080 

GCGGAAGCAG TCGCAGAAGG ATTTGAAGAG AAATAAGATT GAAAAGTCTA GGCAAAATTC 10140 

AATTTGAGAA GAAAAGACAA CTAATTTTAT AGTCTATTAA ACGCATTGTA TCACGTTTTT 10200 

GAATACCTGA TATGATGCGT TTTTTATTTA TTTTAAAAAT TTGCGAGGTA TGACTTTTTA 10260 

TACTCAACAA GAATCAAAGA GAAACTTAGC AAGCTAACAG TAGTAAGATA AAATAGGAAT 10320 

TTGATATTAG GGATAAGATT GGTAAATAGT GTAATATTTT TACAACAATA AATTTATATA 10380 

GTTATTTCTG GTTTCTGAAA AGTATTATAT TTTATTTCAT ATTATACAAA TTTTTATTTT 10440 

ATAATATCAG AACATACTTT TTTTAAAAGC AAATATGATA CAATTTTATT TGAAAAAAAT 10500 

AAAAAAGGAG ATTTTATTAT AAAATTAAAA AGACTTGCTT TAATTAGTGG TATCGTCGGT 10560 

CTTGTGGGAG GAATTTTACT TCTTATTGGT CCTTTTGTCT TGTTGGGAAT AGCGGTAAAC 10620 

ACAGCTGCTA CAACTCTTAA TGGAGGAGCT ACTGCAGGGG CTTTTTCAGG TGTAGCCTTA 10680 

CTCTTGAATG CCTTGAAGAT TGCAAATCTT GTTCTTGGTA TCATTGCTAT TGTTTACTAT 10740 

AAAGGAGATA AGCGTGTAGG TGCAGCTCCG TCTGTACTAA TGATTGTTTC TGGTGGAGTT 10800 

AGTCTCATTC TATTCCGTTC TTAGGATGGG TTGGGGGGAT TTTTGCTATT ATCGGAGGAT 10860 
CTCTATTCCT TTCAACATTG AAGAAATTCA AATCAGAAGA ATAAAAGGTA TTTTAGCATG 10920

AAAAGAACAA AAAAGTTTAT CGGTATAGGA GTAGCTCTAT TATCTCTTTC TCTTCTAGTT 10980

GCATGTGGAA CATAAAGTTC AAAGAATACT TCAACAAGTA ATGATGAGAA GACAGTAGCA 11040

ACATCCAATA GTTCAAAAGA AACAATCACT TTCGATACAC CGGTTGTAAC AGACGATGCG 11100

ATTGAATCAA TACGCACTTA TGCAGATTAT ATAGATCTTT ATAAAAATAT TTTTGATGAT 11160

TATTTTACTA AAGCTGAGGA AGGTTTCAAA GGCATAGCTA TGGAAAATAA TGACTCGTTT 11220

ACTAAACTAA AAGAGTCAAC TCAAAAATTA TTCGATGCGC AGAAAAAAAG GTTAAATAAT 11280

GAAGATAGAA TAGAAACAAC CAAAAACAAT GTGATTGCCA AACATTGTCA AACAGTCCTT 11340

TCCTTTTTGG TTTTGACTAG CTTTTTTGTG AAAAATTGTG TAAAATAGAA TAGATAAACG 11400

AGGGGAAACC TCGGAAAATT TAAAGGAGAA TCCATCTAAT GGTAAAATTG GTTTTTGCTC 11460

GCCACGGTGA GTCTGAATGG AACAAAGCTA ACCTTTTCAC TGGTTGGGCT GATGTTGATT 11520

TGTCTGAAAA AGGTACACAA CAAGGGATTG ACGCTGGTAA ATTGATCAAA GAAGCTGGTA 11580

TCGAATTTGA CCAAGCTTAC ACTTCAGTAT TGAAACGTGC TATCAAAACA ACTAACTTGG 11640

CTCTTGAAGC TTCTGACCAA TTGTGGGTTG CAGTTGAAAA ATCATGGCGC TTGAACGAAC 11700

GTCACTACGG TGGTTTGACT GGTAAAAACA AAGCTGAAGC TGCTGAACAA TTTGGTGATG 11760

AGCAAGTTCA CATCTGGCGT CGTTCATACG ATGTATTGCC TCCAAACATG GACCGTGATG 11820

ATGAGCACTC AGCTCACACA GACCGTCGTT ACGCTTCACT TGACGACTCA GTTATCCCAG 11880

ATGCTGAAAA CTTGAAAGTG ACTTTGGAAC GTGCTCTTCC ATTCTGGGAA GATAAAATCG 11940

CTCCAGCTCT TAAAGATGGT AAAAACGTAT TCGTAGGAGC TCACGGTAAC TCAATCCGTG 12000

CCCTTGTAAA ACACATCAAA GGTTTGTCAG ATGACGAGAT CATGGACGTG GAAATCCCTA 12060

ACTTCCCACC ATTGGTATTC GAATTCGACG AAAAATTGAA CGTCGTTTCT GAATACTACC 12120

TTGGAAAATA AAAAATTGTA AGTCTAGAAT TGATTTCTAG GCTTTTTATG TTAGTATGGA 12180

AGTATGATAA GGAATAAAAA ACAAGATTAT GTACTGGCCT ACAAGCAACC AGCTTCAACC 12240

ACTTACATGG GTTGGGAAGA AGAAGCTTTA CCGATAGGCA ATGGTTCTTT AGGAGCAAAA 12300

GTATTTGGCC TTATAGGGGC TGAACGGATT CAATTTAATG AAAAAAGTCT CTGGTCTGGA 12360

GGTCCACTTC CTGATAGTTC AGATTATCAG GGTGGAAATC TTCAGGATCA GTATGTTTTT 12420

TTAGCTGAGA TTCGGCAGGC TTTGGAGAAG AGAGATTACA ATCTGGCTAA GGAACTGGCT 12480

GAGCAGCACC TAATTGGGCC AAAAACGAGT CAATATGGGA CCTATCTGTC TTTTGGGGAT 12540

ATTCACATTG AGTTCAGCCA GCAAGGTACG ACTTTGTCTC AGGTGACGGA CTATCAGAGA 12600
CAGCTGAATA TTAGTAAGGC ACTTGCGACG ACTTCTTATG TCTATAAGGG AACGCGATTT 12660

GAACGTAAAG CTTTTGCGAG TTTTCCAGAT GATCTCTTGG TTCAATGTTT TACTAAGGAA 12720 

GGGTTGGAAA CTCTAGATTT TACTATAGAA CTATCCTTGA CCTGTGATTT GGCTTCTGAT 12780 

GGAAAGTATG AGCAGGAAAA ATCTGATTAC AAGGAGTGTA AGTTGGATAT TACTGATTCT 12840 

CATATCTTGA TGAAGGGAAG AGTTAAGGAT AATGATCTGC GGTTTGCTAG TTATCTAGCT 12900 

TGGGAAACGG ATGGAGATAT TAGAGTTTGG TCAGATAGGG TTCAGATATC AGGAGCCAGT 12960 

TATGCCAATC TCTTCTTGGC CGCTAAGACG GATTTTGCCC AAAATCCTGC TAGCAATTAT 13020 

CGCAAGAAAC TAGATTTAGA GCAACAGGTG ATAGACTTGG TGGACACAGC TAAAGAAAAG 13080 

GGCTATACCC AATTGAAATC AAGGCATATC GAGGACTACC AAGCCTTATT CCAGCGTGTT 13140 

CAATTGGATT TGGAAGCTGA TGTTGACGCA TCCACTACAG ATGATTTGTT AAAAAATTAT 13200 

AAGCCACAAG AAGGGCAGGC TTTGGAGGAG CTGTTCTTCC AGTATGGACG GTATTTATTG 13260 

ATTAGTTCGT CCAGAGACTG CCCAGATGCT CTACCAGCTA ACCTACAGGG AGTCTGGAAT 13320 

GCGGTCGACA ATCCTCCTTG GAATTCGGAC TATCACTTAA ATGTCAATCT GCAGCTGAAT 13380 

TATTGGCCAG CCTATGTTAC CAATCTCCTA GAGACGGTCT TTCCAGTCAT CAACTATGTA 13440 

GATGATTTGC GTGTCTATGG TCGTCTAGCG GCTGTAAAGT ATGCAGGAAT CGTCTCTCAG 13500 

AAAGGTGAGG AGAATGGTTG GTTGGTTCAT ACTCAAGCGA CTCCCTTTGG TTGGACGGCA 13560 

CCTGGTTGGG ATTACTATTG GGGTTGGTCA CCAGCTGCCA ATGCGTGGAT GATGCAAACC 13620

GTTTATGAAG CCTATTTATT TTATAGGGAC CAAGACTATC TCAGGGAGAA AATTTATCCC 13680 

ATGTTGAGGG AAACGGTTCG TTTTTGGAAT GCCTTTTTAC ATAAGGATCA GCAGGCGCAG 13740 

CGTTGGGTGT CTTCTCCGTC TTATTCCCCA GAACATGGGC CGATTTCGAT TGGCAATACC 13800 

TATGACCAAT CTCTGATTTG GCAGTTATTT CATGATTTTA TTCAGGCTGC TCAGGAATTG 13860 

GGACTGGATG AGGACTTGTT GACTGAGGTT AAGGAGAAGT CTGATTTACT AAATCCTTTG 13920 

CAAATCACTC AATCTGGTCG AATCAGGGAG TGGTATGAGG AGGAAGAGCA GTATTTTCAA 13980 

AATGAGAAAG TGGAGGCCCA GCATCGGCAC GCTTCCCATC TAGTGGGACT GTATCCTGGC 14040 

AATCTCTTTA GCTACAAGGG ACAAGAGTAT ATTGAAGCGG CGCGTGCTAG CCTCAATGAT 14100 

CGTGGAGATG GCGGCACAGG CTGGTCCAAG GCTAATAAGA TCAATCTCTG GGCGCGTTTG 14160 

GGAGATGGCA ATCGAGCCCA TAAATTATTG GCAGAGCAGT TAAAGACATC CACCTTGCAA 14220 

AATCTTTGGT GTAGCCATCC TCCTTTTCAG ATAGATGGTA ATTTTGGTGC TACTAGTGGC 14280 

ATGGCAGAAA TGTTACTCCA GTCTCATGCA GCTTATCTGG TACCTCTAGC TGCCCTACCT 14340 

GATGCTTGGT CAACAGGTTC TGTTTCAGGC TTAATGGCAC GTGGACATTT TGAAGTGAGC 14400

ATGAGCTGGG AAGATAAAAA ACTCTTACAG TTGACCATTT TATCAAGGAG TGGAGGAGAT 14460

TTGCGAGTTT CTTATCCAGA TATTGAGAAG AGTGTGATTA AAATGAATCA AGAAAAAATA 14520

AAAGCGAAAT GCATGGGGAA AGATTGTATT TCGGTGGCAA CAGCAGAAGG TGATCTTGTT 14580

CAATTTTATT TTTAAGAAGA TGTTATAAGG CAGTAATTTG AAACTGCCTT TTAATAAGGA 14640

TTTAAGAATA TAAGCAGTTT TCAACTAGTT GAAAAAACGT TATAATGATA ATAGGAAGTA 14700

ATACTCAATG AAAATCAAAG AGCACAAACT AGGAAGCTAG CCGCAGGTTG CTCAAAACAG 14760

TGTTTTGAGG TTGCAGATGG AAGCTGACGT GGTTTGAAGA GAGATTTTCG AGGAGTATAA 14820

TTTGTTTGAT AGAGGGTGGG TCTGATGGCT TATATTGAGA TGAAACACTG TTACAAGCGT 14880

TATCAGGTTG GGGACACGGA GATTGTGGCC AATTGTGATG TGAATTTTGA GATTGAAAAG 14940

GGGGAGCTGG TTATTATCCT TGGTGCTTCA GGTGCAGGCA AGTCAACAGT TCTTAACCTT 15000

CTTGGGGGAA TGGATACCAA TGATGAAGGG GAAATCTGGA TTGATGGTGT TAATATTGCG 15060

GATTATAGTT CCCACCAGCG CACCAATTAC CGTAGAAATG ATGTGGGGTT TGTTTTTCAG 15120

TTTTATAATC TAGTTTCTAA TCTGACAGCT AAGGAAAATG TGGAACTGGC TTCTGAAATT 15180

GTGACAGATG CCTTGAATCC TGATCAGGCC TTGACAGATG TAGGTCTGGC TCATCGTCTC 15240

AATAACTTTC CAGCCCAGCT TTCTGGAGGG GAGCAACAGC GAGTCTCCAT TGCACGCGCG 15300

GTAGCCAAAA ATCCTAAAAT TCTCCTTTGT GATGAACCGA CTGGAGCCTT GGATTATCAG 15360

ACGGGCAAGC AGGTTTTGAA AATTCTCCAA GACATGTCTC GTCAAAAGGG AGCGACGGTG 15420

ATCATOGTGA CTCATAATGG AGCTTTGGCG CCCATTGCTG ATCGCGTGAT TCAAATGCAC 15480

GATGCCAGTG TCAAGGATGT GGTGCTCAAC CAGCATCCTC AGGATATTGA CAGTTTGGAG 15540

TACTAGCATG ATCAAGCGAA AAACTTATTG GAAGGACTTA GTTCAGTCCT TCACAGGGTC 15600

CAAGGGGCGT TTTTTATCCA TCTTGATCCT GATGATGTTG GGATCTCTAG CCTTAGTAGG 15660

CCTCAAAGTA ACCAGTCCCA ACATGGAGGC GACAGCTAAT GCTTATTTAA CAACTGCTCA 15720

AACCTTGGAT TTGGCAGTCA TGTCTAACTA TGGCTTGGAT CAAGCAGACC AAGAAGAACT 15780

AAAACAGACG GAGGGCGCAG AGGTCGAGTT TGGCTATTTG ACAGATGTGA CTATGGATAA 15840

TGGGCAGGAT GCCATTCGGC TGTACTCCAA ACCAGAGCGA ATTTCAACCT TTCAGCTAAG 15900

AAAGGGACGA CTTCCTCAGT CAGACAAGGA AATCGCTTTG GCCACTCATT TGCAAGGCCA 15960

ATACAGCGTG GGACAGGAGA TTAGTTTTAA AGAAAAAGAA GAGGGTCATT CCTCTTTAAA 16020

AGACCATACT TATACCATTA CTGGTTTTGT GGATTCGGCT GAAATCCTGT CCCAGCGAGA 16080

TATGGGCTAC GCAGGAAGTG GAAGTGGGAC TCTGACAGCC TATGGGGTGA TTTTACCTAG 16140

TCAATTTGAT CAGAAAGTCT ACAATATAGC TCGTTTGAAA TATCAAGATT TAGCGGGTTT 16200

AAATGCCTTT TCATCAGCTT ATGAAGAAAA ATCCAAGCAA CATCAAGAAG AGCTTGAACA 16260 

AATTTTATCA GATAATGGCA AGGTACGTCT GCAACTTTTG AAAAAAGAAG GACAAGAGTC 16320 

TCTAGACAAG GGGCAAGAGA CCCTTGACAA GGCTCAGACT AATTTGCAGG AAGGCAAGCG 16380 

TCGTTTAGCA GCTGCTCAAG CTCGTATACA GGCTCAAGAA AGTCAACTAG CCTTGTTTCC 16440 

TCAAGTTCAG AGAGAGCAGG CTAGTGCTCA ACTTACCCAA GCCAAGCAGG AATTGGGCAA 16500 

GGAAGAGGAC AAACTAAAGC AAGCTGAACA AAATCTAGCC CAAGAAAAGG AAAAATTAGA 16560 

AAAACATCAG CAAGTCTTGG ATGATTTGGC GGAGCCAAGG TATCAGGTTT ATAATCGTCA 16620 

GACCATGCCA GGTGGTCAGG GCTATCTTAT GTATAGCAAT GCTTCATCCA GTATTCGAGC 16680 

AGTGGGCAAT ATCTTTCCTG TGGTACTTTA TGCCGTAGCA GCCATGGTGA CCTTTACGAC 16740 

CATGACTCGC TTTGTAGACG AAGAGCGAAC TCATGCAGGG ATTTTTAAGG CCTTGGGTTA 16800 

TCGTAGTAAG GATATTATCG CCAAGTTTCT CCTTTATGGA CTAGTAGCTG GGACTGTCGG 16860 

AACGGCTCTA GGTAGTATAC TTGGTCATTA TTTGCTAGCC AGTGTAATTT CAAGTGTCAT 16920 

TACAAAAGGC ATGGTGGTGG GAGAAACTCA GATTCAGTTC TATTGGACCT ATAGCTTACT 16980 

AGCTTTTGTC TTGAGCTTGT TGGCGAGTGT GTTACCAGCC TATCTGGTGG CTTGGAGGGA 17040

ACTTCATGAC GAAGCAGCCC AGCTTCTACT TCCTAAACCT CCTGTCAAAG GAGCTAAAAT 17100 

CTTATTGGAG CGTATCGGTT TTATCTGGCG TCGTCTCAGT TTTACTCATA AGGTAACAGC 17160 

CCGCAACATC TTTCGTTATA AGCAGAGAAT GTTGATGACA ATCTTTGGTG TGGCAGGTTC 17220 

TGTAGCTCTG CTCTTTGCAG GTTTGGGAAT CCAATCTTCT GTAGCAGGAG TTCCGTCTAA 17280 

ACAGTTTCAA CAAATCCAAC AGTATCAGAT GCTTGTCTCT GAAAATCCTA GTGCGACCAA 17340 

TCAGGACAAG GTAGAGCTAG CAGAAGTGTT GAAAGGGCAG GAGATACTAG CCTACCAGAA 17400 

AATCTATTCT AAAGCGCTAT ACAAGGATTT CAAAGGCAAA GCTGGTCTTC AAAACATTAC 17460 

TCTTATGATG ATAGAGAAGG AAGATTTGAC TCCCTTTATC CATCTTCAAC ATCATCAGCA 17520 

GGAGCTGACA TTAAAAGATG GCATCGTTAT TACAGCTAAA CTCGCCCAGC TGGCAGGTGT 17580

CAAGGTTGGG CAGACTTTAG AAATTGAAGG TAAGGAACTA AAGGTCGTTG CTATTACTGA 17640 

GAACTACGTT GGTCACTTTA TTTATATGAG TCAGGCTAGC TATGAGCAAC TTTACGGACA 17700 

GCTACCCCAA GCCAACACTT ATCTGGTCTC ATTAAGGGAT ACCAGTGCAA CTAGTATCGA 17760 

AAGTCAGGCG GGCTTGCTTA TGAATCAATC TGCGGTGTCC AGCGTTGTCC AAAATGCTTC 17820 

AGCCATTCGA CTCTTCGACT CTATCGCTAG CTCACTCAAT CAGACCATGA CCATCTTGGT 17880 

CATCGTATCG GTTCTATTAG CTATTGTCAT CCTTTACAAT CTGACCAATA TCAACGTAGC 17940

TGAGAGAATC CGTGAACTCT CCACTATCAA GGTTCTTGGT TTTCATAATA ATGAAGTCAC 18000

CCTCTACATT TACCGTGAGA CGATTGTGCT GTCCCTTGTG GGAATCGTAC TTGGTCTGAT 18060

AGCTGGTTTC TATTTACACC AATTTTTGAT TCAAATGATT TCGCCTGCGA CTATTCTCTT 18120

TTATCCGCAG GTAGGCTGGG AAGTCTATGT AATCCCAGTG GCAGCAGTAA GCATCATTTT 18180

GACCTTGCTT GGTTTCTTCG TCAATTATTA TCTGAGAAAG GTTGATATGT TAGAAGCCCT 18240

GAAATCTGTA GAGTAAGGTA GTTATTTTTA GCTGATTGAA CTTCTATTTA CTAATATTCA 18300

AAAATCCTCC GTTTCAAAGA GCAGGGAACT CTTTGTGACA GAGGATTTTT TCTATAGGGC 18360

TTTAGCAGCT GCAATTGCGG CTTCGAAGTT TGGCTCAGAA TTGATATTAT CCACGTATTC 18420

AACGTAGCGA ATCGTATTGT CAGTATCGAG GACAAAGACT GCGCGTGCTA ATAGGTGCCA 18480

TTCGTTGATC AAGAGGGCAT AATCGCGCCC GAAAGAATGG TCAAAGTAGT CTGAAAGCAT 18540

AATGGCATTG TCAAGGCCTT CAGCACCGCA CCAACGTTTT TGAGCAAAAG GTAGGTCCAT 18600

TGAAACAGTC AATACGACCG TGTTGTCCAG TCCAGCCAAT TCTTCATTAA AACGACGTGT 18660

TTGAGTTGAG CAGATGCCTG TATCGATAGA AGGAACGACA CTCAAGACTT TTTTCTTGCC 18720

ATCAAAATCA GCCAGAGATT TTTTAGAAAG ATCTGTTGTA GTAAGAGAAA AATCAAGCGC 18780

CTTGTCGCCG ACTTGTAGTT GTTTACCTGT AAAGCTCACA GGATTTCCGA GAAAAGTTAC 18840

CATAGGATAC TCCAATCTTT TTTCTTCCAT TTTAGCTGAA ACAGTCGGAA TTTTCCAATG 18900

ATTTGACCGG AAATATGGGC ATAGAAAAAA CGCCAGCTCA TGTGAGAATG ACGTTTTTCA 18960

TAGGTTTATT TTGCCAATCC TTCAGCAATC TTGTCAAGGT TGTATTTCAT CATGCTGTAG 19020

TAGCTGTCGC CTTCTTTACC TTGTTCTGCG ATAGAGTCAG TAAAGATTTG AGCGTAGATT 19080

GGGATGTTTG TGTCTTGAGA AACAGTTTTC ATTGGACGGT CATCCACACT TGATTCTACA 19140

AAGAGTGATG GAACTTTTGT TTGGCGAAGT TTTTCAACCA AGGTCTTGAT TTGTTCAGGA 19200

GTTCCTTCTT CTTCAGTATT GATTTCCCAG ATGTAAGCAC TTGGGACACC ATAGGCTTTA 19260

GAGAAGTATT TGAATGCTCC TTCGCTGGTT ACAATGAGTT TCTTTTCAGC AGGGATCTTA 19320

TTAAATTTAT CCTTACTTTC TTTATCAAGT TTGTCTAACT TATCAGTATA TTCTTTGAGA 19380

TTTTTTTCAT AGAATTCTTT ATTGTTAGGG TCTTTGGCGC TCAATTGTTT GGCGATATTT 19440

TTAGCAAAAA TAATACCGTT TTCAAGGTTA AGCCAAGCGT GTGGGTCTTC TTTTCCTTTT 19500

TCATTTTGAC CTTCAAGGTA GATAACATCA ACGCCGTCGC TGACTGCGAA GTAGTCTTTG 19560

TTTTCAGTTT TCTTGGCATT TTCTACCAAT TTTGTAAACC AAGCATTGGC ACCTGTTTCA 19620

AGGTTGATAC CGTTATAGAA AATCAAATTA GCCTCAGAAG TTTTCTTAAC GTCTTCAGGA 19680

AGTGGTTCGT ATTCGTGTGG GTCTTGCCCA ATCGGAACGA TACTATGAAG Vj rcAATTTTG 19740


TCACCAGCAA TATTTTTAGT AATATCAGCG ATGATTGAGT TTGTAGCAAC AACTTTTAGT 19800

TTTTGACCAG AAGTTGTATC TTTTTTTCCG CTAGCACATG CTACAAGAAT GATTGCAGAA 19860

AGAAAGAGAA CGAGTAATGT ACCTAATTTT TTCATTAGAT CCTCCAATTT ATTAGGGCTT 19920

TGCCCCTTAT TTTAACAAAT GTTTATTTTT CAGTTTCAAA TATCGTTGTT TGGGAGCGAT 19980

AAAGAAGCTA ATGAGAAAGA AACTAGCAGC TGTAAGCACG ATACTAGAAC CTGCCGCAAC 20040

ATTAAAACTA TAGCCAATAA AGAGTCCCAA AACTGAAGCA GTAGCTCCGA AGGTTGAGGA 20100

AAGGAAAATC ATACTTTTCA GACTATTAGC ATACAGATAA GCAGTTGCAG CTGGGGTAAT 20160

CAGCATGGCT ACAATCAGGA TAGTTCCGAC ACTTTGCATG GCTGTCACAG ACACGAGAGT 20220

CAGGAGTACC ATGAGAAGGT AGTGATAGAA ATTGACAGGC ATTCCCATGG CTTTAGCCAA 20280

GAGTTCATCA AAGGAAGTTA TCAAGAGTTG CTTGAAGAAA ATCCAGATTA ACAAGAGGAT 20340

AGCTGCCCCC ACACCCATAG TAATAAACAT ATCCGTATCT TGGACGGCCA GGATATTACC 20400

AAAAAGGATA TGGAAAAGGT CAGTTGAACT TTTAGCGACA CCAATCAAGA TGATACCGAG 20460

GGCTAAGAAA GAAGAAAAGG TAATGCCGAT GGCGGTATCG CTTTTGATAA TCGAGTTTCC 20520

TTTGATGTAG GTAATGATGA TGGCAGCTAG CAATCCAAAG ACAATGGCTC CGATAAAGAA 20580

GTCAAGGCCC AAGATGAAGG ATAGGGCTAC ACCTGGTAAG ACAGCATGTG AAATGGCATC 20640

TCCCATGAGT GACATCCCGC GTAGAATAAT GAAACATCCC ACAGCTCCAG CTACAATCCC 20700

GACGACAATA GCTGTTATCA AGGCATTTTG TAGGAAATGG AATTTTTGCA ATCCATCGAT 20760

AAATTCTGCA ATCATAGGTC ACCTCCATTG AAAAAGAGTT GATTAGCGTA AGCTTCTTTT 20820

AGATTGGTTT CGGTAAAAGT TTCTTTTGTT GGACCAAAGG CAATCACTTC TCGATTGACA 20880

AGTAAGACTT GATCGAAGTA GTGGGGAATC TTGCTGAGGT CGTGGTGAAC GATGAGAACC 20940

GTCTTCCCAG CTTTTTTCAA ATCTCTCAGC GTATTCATGA TGATTTCCTC ACTGACAGAG 21000

TCAATCCCAG CAAAGGGTTC ATCCAAGAGG ATATAGTCGG CTTCCTGCAC CAAACATCTG 21060

GCAATCAAGA CCCGCTGGAA TTGACCTCCA GACAGTTGAC TAATTTGACG TTGAGCGTAG 21120

TCAGCTAGGC CGACGATTTC AAGGGCCTCT TGCACTTTCT TCCAATGTTT AGCCTTTAAA 21180

CTTCGAAAGA GAGGAATAGA GGGAAATAGT CCTAACGAGA CGCATTCCTT GACCTTGATG 21240

GGAAAGTTGT AGTCGATATT GATTTTTTGT TCGACATAGG CAATTCGGTG TAAGGATTTT 21300

TTAACTTCCT TGTCATCGAG AAATGCCTGA CCTTGATGTG GGATAATTCC CAACATACCT 21360

TTTAATAGTG TTGATTTCCC AGCGCCGTTT GGACCAATGA TGCCGGTAAT TGTTGGTCCA 21420

TGGAGCACTA GTGAAATATC CTTAAGTGCC AACGTTTCTT TGTAGGAGAC ACTGAGGTTT 21480

TCGATACGTA TCATAAACTT GTATTCCTCC TGTCTCTTAA TATACATTAA AAAAAAAATT 21540

AAGTCAAGTT AATTTTTGAA AAAATTAAAA TAATAACTGA AAAATAGATT CTAAAGATAA 21600

CTTTCAGCAT AAATTTCTAA ATTATAAAAC GCATAGTATC AAGTGTAAAA AACTTGGAAT 21660

TATGCGTTTT ATCATGGAAA GATTTTTTAT AATAGCTAAA AAATAA 21706

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6171 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
GATCCCCAGG AAAAACCGAG GTTTTCCCAA TCAATCGTTA CTGTCATATT CCACTCCTTA 60

TTCTAAAAAC CTATTTCTTA TATTCTACAC TATTTTTCTA AAATAGCAAG TATATTTTGT 120

AATTTTCAGA AAATTTCTCC AATAAAAACC AACTCTTAGA ACTGATTCTT CATTTCACTT 180

ATTTATCTTC AGTAACTACT TCCTGAAGAT AAGCGTCAAA AACTTCTTCA TCTGAAATCG 240

TGTCAGAAAT GAAGCTTCCA TTGCTAGTGC GTTCTGACAA GTTCAAGTCT TGCAATCGGC 300

TTTCATAGAT TGTTCCTTTA TTGGATTGGA CAAGCAGAGT TTGGTCGTTC ACATCCACTT 360

CCGTACTGAA GAAATCGCCA ACAAATCCTT GCTCTGCAAC TGCTCCTGCC AAGAAGACAC 420

GATGCGGTTT GTTTTTCAAC TCACGCAAGA CTTGTAATCC TCGTTTGGCA CGGCTGGTTG 480

CTAGAATTTC CTCAATGGAA ACACGTTTCA AGCTTCCACG CTGGGTCAAG AGGTAGAAGG 540

ACGAAGTATT ACAGATAAAG CCAGATTGGA GGACATCATC TTCTTTCAAA TTCATAGCCT 600

TGACACCTGC TGCCTTAGCA CCGACAACCG GAACCTCTTC GATATTGAAA CGCAGGGCAT 660

AACCATTTTG ACTAACCAAG ACAACATCAT CTAGTTTAAT CGGAGCCACT GCTACAATCT 720

GATCTGTATC GTCTTTGAGC TTAGCATACT TGACAGACTT AGATCTATAG GTCCGCCATG 780

GAGTGAATTC TTTTCGCTCT ACCCGTTTGA TTTGACCAAG GCGAGTCACT GCAAAGTAGG 840

TTGTCGCATC GTCAAACTGA TCCAGTACTT CCACATAAAG GATTTCTTCA TTCGTTTCAA 900

AGTTTGTGAT GGTTTGGCTC AGATGCTCTC CGATGTCCTT CCAACGAATA TCTGCCAACT 960

CATGGATTGG TCTGTAGATG ACATTTCCAA GACTTGTGAA CATCAAGAGG TGCTGGGTTG 1020

TCTTGGCAGA TTGAACAAAA ATCAAACGGT CATGATCACG CTTGCCAATT TCTTCCAAGG 1080

TGGAAGCCGC AAAGGAACGT GGACTGGTAC GCTTGATGTA ACCTGCCTTG GTCACGCTGA 1140

CGTAGGTATC TTCCTCAGCG ATAAGACTAG CTGTATCAAT CTCAATTGCT TTCGCAGTGT 1200


CTTCTAAAGA ACTCAAACGA GGAGTTGCAA ATTTCTTCTT GACCTCACGA AGTTCTTTCT 1260

TCATGAGATT GTACATAGTC CTTTCATCAC CGATAATAGC CGCCAGCATA GCAATCTTCT 1320

CACGAAGCTC TGCTTCTTCT TCCTGCAAGA CAACCACATC GGTATTGGTC AAACGGTACA 1380

GTTGCAAAGT TACGATAGCC TCAGCCTGTT CTTCCGTAAA ATCATAGCTA ACTTTGAGGT 1440

TTTCCTTGGC GTCCGCOTTA TTCTCAGAAG CACGGATAAG AGCAATGACT TCATCCAAAA 1500

TCGAAATCAC ACGAATCAAA CCTTCGACGA TATGGAGACG TTTCTCAGCC TTTTCTTTGT 1560

CAAAGGGTGA ACGCGCCAAA ATCACTTCTC GACGGTGAGC GATATAGCTA GACAGGATTG 1620

GAACAATCCC AACCTGACGA GGTGTGAAAT TGTCAATCGC CACCATATTA AAGTTGTAGT 1680

TGATTTGTAG GTCGGTGTAG TTAAATAAGT AGTTGAGAAC AAGCTCAGTA TTAGCGTCTT 1740

TCTTAAGTTC GATAGCGATA CGAAGACCAT CACGGTCAGA CTCATCACGA ACCTCAGCAA 1800

TCCCAGCTAC CTTGTTATTA ACACGAACAT CATCGATTTT CTTGACTAGA TTGGCCTTAT 1860

TGATTTCATA AGGAATCTCA ATAATAACGA TTTGTTCCTT ACCACCTTTT AGCTTTTCAA 1920

TTTCAGTCTT GGAACGAACA ACCACGCGCC CTTTCCCAGT CTCATAAGCT TTCTTGATTT 1980

CATCACGACC CTGAATAATA GCCCCTGTAG GGAAGTCTGG TCCAGGCAAG AATTCCATGA 2040

GTTTATCAAT CTTTGCAGTT GGGTGGTCAA TCATGTAAAC TGCAGCATCT ATGACCTCAG 2100

CTAAATTATG GGGAGGAATG TCTGTGGCAT AACCAGGCGA AATCCCAGTC GAACCATTGA 2160

CCAAGAGGTT TGGAAAGGCT GCTGGCAAGA CCGTTGGTTC TTTCTCCGTA TCGTCAAAGT 2220

TCCATGCAAA AGGAACTGTC TTTTTCTCGA TATCCTGAAG AAGGTAGCCT GCAATTTCAG 2280

ACAAACGTGC CTCAGTATAA CGCATAGCCG CAGGAGGATC TCCGTCCATA GAACCGTTAT 2340

TACCGTGCAT TTCAACTAGA ATCTCACGAT TTTTCCAGTT CTGTGACATA CGAACCATGG 2400

CATCATAGAT AGAAGAATCC CCGTGTGGGT GGAAATTCCC CATGATGTTC CCGACTGACT 2460

TGGCCGACTT ACGGTAGCTC TTGTCAAAAG TATTGCTATC CTTATTCATA GAATAAAGAA 2520

TACGGCGCTG AACCGGCTTC AACCCATCAC GAATATCTGG CAAAGCCCGG TCTTGAATAA 2580

TGTACTTGGA GTAGCGACCA AAGCGCTCTC CCATGATGTC CTCCAGGGAC ATGTTTTGAA 2640

TGTTAGACAT AAGATACAAA GCCCATAAAA TACCAAGTGA AAATAGAAAA TTCTTGAAGT 2700

AAGCAAACTC ACAAGAGAAT TTATCTTTTT CACACAGTAT CTAGGGCGTG TTCAACTCCT 2760

TTCAAAGAAT GTAGAGTAGG TTTTTATGCA GTAAAAGATA TTTTACGGGA ATTCCTCCCG 2820

TGTTCAGTTA CGATAAGTAA CCAAACTATC CTGTTTGTAT TTTTCAATAT GAAAATCTGG 2880

TTTTCCAAAA TTAGTCTTAG TTTGTGTCTT AGCCGCTCCC TTAAGCGCCT CTTTGAGATA 2940

AGCACTCATA GCAGATTCTT CATTAATAAT CCTGCAATTT TTTCAAACCA AGATTTTCAA 3000


ACTGCTTTTT CACATAGTCA TTCACATCCG ACTCTAATTT CCAGTTTACT AACATATTAT 3060

TTTCTTTCAT TAAAACACTG TCGTTTCTTC TAGCGTAAAC TTGACATTAT CTTCAATCCA 3120

TTTACGGCGT GGTTCTACCT TATCTCCCAT GAGAACATTG ACGCGGCGTT CGGCGCGCGC 3180

TAAATCTTCA ATTGTGACAC GGATGAGGGT ACGTGTTTCT GGGTTCATGG TTGTTTCCCA 3240

GAGCTGGTCC GCATTCATCT CACCAAGTCC TTTGTATCGT TGGAGGGTAG CGCCTTTACG 3300

GAACTGTTTA CGGAGTTCTT CTAGTTCTCC GTCCGTCCAA GCGTAGGGCA CTTCTTCTTT 3360

CTTGCCTTTA CCTTTGGACA TCTTGTAAAG AGGTGGGAGG GCAATATAGA CATGACCTGC 3420

CTCGACTAGC GGACGCATGT AACGGTAGAA AAATGTCAAG AGCAAGGTCT GGATATGGGC 3480

ACCGTCGGTA TCCGCATCGG TCATGATAAT GATCTTATCA TAGTTGGCAT CTTCAATAGA 3540

GAAGTCTGCT CCAACACCCG CACCAATGGT ATAAATCATG GTATTGATCT CTTCATTTTT 3600

GAGGATATCC GCCATGTTGG CCTTGGCTGT ATTGACAACC TTACCACGAA GAGGTAGAAT 3660

AGCCTGGAAC TTGCGGTCAC GACCTTGTTT GGCAGAACCA CCGGCAGAGT CCCCCTCAAC 3720

TAGATAGAGT TCATTCTTAG CAGGATTCTT AGATTGGGCT GGGGTCAATT TCCCAGACAA 3780

CAAGCCCTTA TCTTTCTTGT TTTTCTTCCC ATTTCGGCTC TCATCACGGG CCTTACGTGC 3840

TGCTTCACGA GCATCACGGG CCTTGATAGC CTTGCGGATG AGGTTAGAAG CTAATTCCCC 3900

ATTTTCCATA AGGAAAAAGG TCAACTTATC AGCCACTATT CCATCCACAA CTGGGCGAGC 3960

TAGGGGGCTT CCTAGTTTAT CCTTGGTCTG TCGTTCAAAC TGCAAGTGTT CTTCAGGAAC 4020

TAAGATAGAA AGAACGGCCG CTAGTCCCTC ACGATAGTCT GAACCTTCAA GGTTTTTATC 4080

TTTTTCCTTG AGAAGACCTG TTTTACGTGC ATAGTCATTG ATGACCTTGG TAATGGCAGA 4140

CTTGAGTCCT GTCTCGTGCG TTCCACCGTC CTTGGTGCGA ACGTTATTGA CAAAAGATAG 4200

AATGTTATCT GAGAATCCGT CATTGTACTG GAGGGCTACT TCGACTTGAA AACCATTGTC 4260

TTCCCCTTCA AAGTAAAGAA CTGGCGTCAA GATTTCCTTA TCTTCGTTGA GATAAGAAAC 4320

AAAATCTTGT ACTCCATTCT CATAGTGGAA CTCAATCGCT TCATTTGTTC GCTTGTCCGT 4380

TAAAGACAAG GTCACATTTT TCAAGAGAAA GGCTGATTCA TTAAGGCGCT CTGAAATGGT 4440

ATTGTACTTG AAATCTGTCG TAGAAAATAT AGTCGCGTCA GGCATAAAAG TAACTTTGGT 4500

GCCTGTTTTA GACTTGGGTG CTGTACCGAT TTTCTTCAAA GTCGTGACAG GTTTTCCACC 4560

ATTTTCGAAA CGTTGCTTGT AAACTGCGCC ATCACGGGTA ATTTCAACTT CTAACCAGCT 4620

AGAAAGGGCG TTAACAACGG AAGAACCCAC TCCGTGAAGT CCACGTGATG TCTTATAGCC 4680

ACCTTGACCG AATTTCCCTC CGGCATGAAG AATGGTAAAG ATAACCTCAA CAGTTGGAAT 4740

TCCCATAGCG TGCATACCTG TCGGCATCCC ACGTCCATGG TCTTGAACCG TTAGACTACC 4800

GTCTTTATTG ATAGTTACAT CAATACGATC ACCAAACCCA GACAAGGCTT CATCGACTGC 4860 

ATTATCAACG ATTTCCCAAA CTAGGTGATG AAGACCAGCG CCATCGGTCG ATCCAATATA 4920 

CATCCCTGGA CGTTTTCGGA CCGCATCCAA CCCTTCTAGC ACCTGAATAG CATCATCATT 4980 

ATAATTGTTA ATATTGATTT CCTTTTTTGA CACAAGGAAC CTCCTATTCG TTCATCTTTA 5040 

CTATTCTACA GGTTTTCCAA GGATTTTGCA AAATTTTTCT TTCTCCGATG TGACAATTTC 5100 

AGCAGAGATT CTCTGCTTTT CTTTCCCAAT TCATGATATA ATAGGAGTAT GATTACAATA 5160 

GTTTTATTAA TCCTAGCCTA TCTGCTGGGT TCGATTCCAT CTGGTCTCTG GATTGGACAA 5220 

GTATTCTTTC AAATCAATCT ACGCGAGCAT GGTTCTGGTA ACACTGGAAC GACCAACACC 5280 

TTCCGCATTT TAGGTAAGAA AGCTGGTATG GCAACCTTTG TGATTGACTT TTTCAAAGGA 5340 

ACCCTAGCAA CGCTGCTTCC GATTATTTTT CATCTACAAG GCGTTTCTCC TCTCATCTTT 5400 

GGACTTTTGG CTGTTATCGG CCATACCTTC CCTATCTTTG CAGGATTTAA AGGTGGTAAG 5460 

GCTGTCGCAA CCAGTGCTGG AGTGATTTTC GGATTTGCGC CTATCTTCTG TCTCTACCTT 5520 

GCGATTATCT TCTTTGGAGC TCTCTATCTT GGCAGTATGA TTTCACTGTC TAGTGTCACA 5580 

GCATCGATTG CGGCTGTTAT CGGGGTTCTG CTCTTTCCAC TTTTTGGTTT TATCCTGAGT 5640 

AACTATGACT CTCTCTTCAT CGCTATTATC TTAGCACTTG CTAGTTTGAT TATCATTCGT 5700 

CATAAGGACA ATATAGCTCG TATCAAAAAT AAAACTGAAA ATTTGGTCCC TTGGGGATTG 5760

AACCTAACCC ATCAAGATCC TAAAAAATAA AATGCCAGTT CTGTACTGCC CCCAAACAGT 5820 

TAGACAAATA ATTTATCCAA AGGATTTAGT TCTGTACTGC ACAGGACTAA GTCCTTTTAG 5880 

TTTTACCTTA ATTCGTTTGT TGTTGTAGTA ATCAATATAG TCTATAATGG CTTGTTCCAA 5940 

TTGATTAAGT GATTTAAATG TTTTCTCATA GCCATAAAAC ATTTCGGATT TTAAAATGCC 6000 

AAAGAAAGAT TCCATCCTAC CGTTGTCTTG GCTGTTGCCC TTACGTGACA TGGATGCTTG 6060 

AATTCCCTTA CTCTCTAGGA ACCGATGATA AGAATCGTGT TGGTATTGCC AGCCTTGGTC 6120 

ACTATGGAGA ATCGTATTCT CGTAGTGCTT CTCTGTGAAT GCCTGTTCCA A 6171 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18475 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:



TATTACAAAT AAAAAAACGG AGGAGTGCTT TATGAAAGCC TATACTTATG TTAAACCAGG 60

ACTTGCTTCT TTTGTTGATG TAGACAAACC AGTTATTCGC AAGCCAACAG ACGCTATTGT 120

GCGTATTGTA AAAACCACTA TTTGTGGAAC AGACCTCCAT ATTATCAAAG GGGATGTTCC 180

TACTTGCCAA AGTGGTACCA TTCTTGGCCA CGAAGGGATT GGGATTGTTG AAGAAGTTGG 240

GGAAGGAGTT TCCAACTTCA AAAAAGGTGA CAAGGTCTTG ATTTCTTGCG TCTGTGCCTG 300

TGGTAAATGC TACTACTGTA AAAAAGGAAT TTATGCTCAC TGTGAAGACG AAGGGGGCTG 360

GATTTTCGGT CACTTGATTG ATGGTATGCA GGCTGAATAT CTACGTGTCC CTCATGCAGA 420

TAATACTCTT TACCATACTC CAGAAGACTT GTCAGATGAA GCTTTGGTTA TGCTGTCAGA 480

CATTCTGCCT ACTGGATATG AAATTGGTGT CTTAAAAGGG AAAGTAGAAC CTGGTTGCAG 540

CGTAGCCATT ATTGGTTCAG GTCCAGTTGG ATTGGCTGCT CTTTTAACAG CCCAATTCTA 600

TTCACCAGCT AAATTGATTA TGGTAGACCT AGACGATAAC CGCTTGGAAA CTGCCCTATC 660

ATTCGGTGCG ACTCATAAGG TTAATTCTTC AGACCCTGAA AAAGCCATTA AAGAAATTTA 720

TGATTTGACA GATGGTCGTG GTGTGGATGT CGCTATCGAA GCTGTTGGTA TTCCTGCAAC 780

ATTTGATTTC TGTCAAAAGA TTATCGGTGT AGACGGAACG GTTGCCAACT GTGGTGTGCA 840

TGGTAAACCA GTTGAATTCG ATTTAGATAA ACTTTGGATT CGCAACATCA ATGTAACAAC 900

TGGTTTGGTA TCTACAAATA CGACTCCACA ATTGTTGAAA GCACTTGAAA GTCATAAGAT 960

TGAACCGGAA AAATTGGTAA CTCACTATTT CAAACTCAGT GAAATTGAAA AAGCCTACGA 1020

AGTCTTCAGT AAGGCAGCAG ACCACCATGC CATTAAGGTC ATTATCGAAA ACGATATCTC 1080

AGAAGCCTAA GTAGTAAAAA TATTTTTGTA CATAAGTAAA TAGAAATTCA GTCATCCATC 1140

AGATGGCTGG ATTTTTTATC AAAAAATTAA GAAATGAGCA TATTTCTTTC CTTGTCTGGC 1200

GGAATTGGTT ATAATATACG GTACAAAGGA ATGAATGAAT ATGTATCGTG TTATAGAAAT 1260

GTACGGAGAT TTTGAACCGT GGTGGTTCTT AGAAGGTTGG GAAGAAGATA TTGTAGCAAG 1320

TAGAAAATTT GACCAGTATT ATGATGCTCT CAAATACTAC AAAACTTGCT GGTTTAGATT 1380

GGAACAAGAA TCGCCTCTTT ATAAAAGTAG AAGCGACTTG ATGACCATTT TTTGGGACCC 1440

GGAAGACCAA CGCTGGTGTG ATGAATGTGA TGAGTATTTA CAACAATACC ATTCTTTGGC 1500

TCTTTTGCAG GATGAGCAGG TTATCCCAGA CGAAAAACTA CGCTCAGGCT ATGAAAAACA 1560

AACCAGTCAG GAAAGGAATC GTTCTTGCCG TATGAAATTA AAATAGAGAA AAGTAACTTT 1620

TTTGGAGTTG CTTTTTTTAT TTTTCTAACT CTTTGCGAAT AGTATAGGTG AGGAGGTAAG 1680

TATGGTTCAA GAAATTGCAC AAGAAATCAT TCGTTCAGCT CGGAAAAAAG GGACGCAGGA 1740


TATCTATTTT GTCCCTAAGT TAGACGCCTA TGAGCTTCAT ATGAGGGTAG GAGACGAGCG 1800

CTGTAAAATT GGTAGCTATG ATTTTGAAAA GTTTGCAGCC GTTATCAGTC ACTTTAAGTT 1860

TGTGGCGGGT ATGAATGTGG GAGAAAAAAG ACGTAGTCAA CTGGGTTCCT GTGATTATGC 1920

CTATGACCAT AAGATAGCGT CTCTACGTTT ATCTACTGTA GGCGATTATC GGGGGCATGA 1980

GAGTTTGGTT ATCCGTTTGT TGCACGATGA GGAGCAGGAC CTGCATTTTT GGTTTCAGGA 2040

TATTGAAGAA TTAGGCAAGC AGTACAGGCA ACGGGGACTC TATCTTTTTG CTGGTCCGGT 2100

TGGGAGTGGT AAGACGACCT TGATGCATGA ATTGTCCAAG TCACTCTTTA AAGGACAGCA 2160

AGTTATGTCC ATCGAAGATC CTGTCGAAAT CAAGCAGGAC GACATGCTTC AGTTGCAGTT 2220

GAACGAAGCA ATCGGCCTAA CCTATGAAAA TCTAATCAAA CTTTCCTTGC GTCATCGACC 2280

AGATCTCTTG ATTATCGGAG AAATTCGTGA CAGCGAGACG GCGCGTGCAG TGGTCAGAGC 2340

TAGTTTGACA GGTGCGACAG TCTTTTCAAC CATTCACGCC AAGAGTATCC GAGGTGTTTA 2400

TGAGCGTCTG CTGGAGTTGG GTGTGAGTGA AGAAGAATTG GCAGTTGTTC TGCAAGGAGT 2460

CTGCTACCAG AGATTAATCG GGGGAGGAGG AATCGTTGAC TTTGCAAGCA GAGATTATCA 2520

AGAACACCAA GCAGCCAAGT GGAATGAGCA AATTGACCAG CTTCTTAAAG ATGGACATAT 2580

CACAAGTCTT CAGGCTGAGA CGGAAAAAAT TAGCTACAGC TAAGCAAAAA AATATCATCA 2640

CCCTATTTAA CAATCTCTTT TCTAGCGGTT TTCATCTGGT GGAGACTATC TCCTTTTTAG 2700

ATAGGAGTGC TTTGTTGGAC AAGCAGTGTG TGACCCAGAT GCGTGTGGGC TTGTGTCAGG 2760

GGAAATCATT CTCAGAAATG ATGGAAAGTT TGGGATGTTC AAGTGCTATT GTCACTCAGT 2820

TATCCCTAGC TGAAGTTCAT GGCAATCTCC ACCTGAGTTT GGGAAAGATA GAAGAATATC 2880

TGGACAATCT GGCTAAGGTC AAGAAAAAAT TGATTGAAGT AGCGACCTAT CCCTTGATTT 2940

TGCTGGGTTT TCTTCTCTTA ATTATGCTGG GGCTACGGAA TTACCTGCTC CCACAACTGG 3000

ATAGTAGCAA TATTGCCACC CAAATTATCG GTAATCTGCC CCAAATTTTT CTAGGCATGG 3060

TAGGGCTTGT TTCCGTGCTT GCCCTTTTAG CACTCACTTT TTATAAAAGA AGTTCTAAGA 3120

TGAGTGTCTT TTCTATCTTA GCACGCCTTC CCTTTATTGG AATCTTTGTG CAGACCTACT 3180

TGACAGCCTA TTATGCACGT GAATGGGGGA ATATGATTTC ACAGGGAATG GAGTTGACGC 3240

AGATTTTTCA AATGATGCAG GAACAAGGTT CCCAGCTCTT TAAAGAAGTC GGTCAAGATC 3300

TGGCTCAAAC CCTGAAAAAT GGCCGTGAAT TTTCTCAGAC GATAGGAACC TATCCTTTCT 3360

TTAGGAAGGA ATTGAGTCTC ATCATAGAGT ATGGGGAAGT TAAGTCCAAG CTGGGTAGTG 3420

AGTTGGAAAT CTATGCTGAA AAAACTTGGG AAGCCTTTTT TACCCGAGTC AACCGCACCA 3480

TGAATTTGGT GCAGCCACTG GTTTTTATCT TTGTGGCACT GATTATCGTT TTACTTTATG 3540

CGGCAATGCT CATGCCCATG TATCAAAATA TGGAGGTAAA TTTTTAAAAT GAAAAAAATG 3600

ATGACATTCT TGAAAAAAGC TAAGGTTAAA GCTTTTACAT TGGTGGAGAT GTTGGTGGTC 3660 

TTGCTGATTA TCAGCGTGCT TTTCTTGCTC TTTGTACCTA ATCTGACCAA GCAAAAAGAA 3720 

GCAGTCAATG ACAAAGGAAA AGCAGCTGTT GTTAAGGTGG TGGAAAGCCA GGCAGAACTT 3780 

TATAGCTTAG AAAAGAATGA AGATGCTAGC CTAAGAAAGT TACAAGCAGA TGGACGCATC 3840

ACGGAAGAAC AGGCTAAAGC TTATAAAGAA TACAATGATA AAAATGGAGG AGCAAATCGT 3900 

AAAGTCAATG ATTAAGGCCT TTACCATGCT GGAAAGTCTC TTGGTTTTGG GACTTGTGAG 3960 

TATCCTTGCC TTGGGCTTGT CCGGCTCTGT CCAGTCCACT TTTTCAGCGG TAGAGGAACA 4020 

GATTTTCTTT ATGGAGTTTG AAGAACTCTA TCGGGAAACC CAAAAACGCA GTGTAGCCAG 4080 

TCAGCAAAAG ACTAGTCTGA ACTTAGATGG GCAGACGCTT AGCAATGGCA GTCAAAAGTT 4140 

GCCAGTCCCT AAAGGAATTC AGGCCCCATC AGGCCAAAGT ATTACATTTG ACCGAGCTGG 4200 

GGGCAATTCG TCCCTGGCTA AGGTTGAATT TCAGACCAGT AAAGGAGCGA TTCGCTATCA 4260 

ATTATATCTA GGAAATGGAA AAATTAAACG CATTAAGGAA ACAAAAAATT AGGGCAGTGA 4320

TTTTACTGGA AGCAGTAGTC GCTCTAGCTA TCTTTGCCAG CATTGCGACC CTCCTTTTGG 4380 

GACAAATTCA AAAAAATAGG CAAGAGGAAG CAAAAATCTT GCAAAAGGAA GAAGTCTTGA 4440 

GGGTAGCTAA GATGGCCCTG CAGACGGGGC AAAATCAGGT AAGCATCAAC GGAGTTGAGA 4500 

TTCAGGTATT TTCTAGTGAA AAAGGATTGG AGGTCTACCA TGGTTCAGAA CAGTTGTTGG 4560 

CAATCAAAGA GCCATAAGGT CAAGGCTTTT ACCTTGTTAG AATCCCTGCT TGCCCTCATT 4620 

GTCATCAGTG GGGGATTACT CCTTTTTCAA GCTATGAGTC AGCTCCTCAT TTCAGAAGTT 4680 

CGCTACCAGC AACAAAGCGA GCAAAAGGAG TGGCTCTTGT TTGTGGACCA ACTTGAGGTA 4740 

GAATTAGACC GTTCGCAGTT CGAAAAAGTA GAAGGCAATC GCCTATAGAT GAAGCAAGAT 4800 

GGCAAGGACA TCGCCATCGG TAAGTCAAAG TCAGATGATT TCCGTAAAAC GAATGCTCGT 4860 

GGTCGAGGTT ATCAGCCTAT GGTTTATGGA CTCAAATCTG TACGGATTAC AGAGGACAAT 4920 

CAACTGGTTC GCTTTCATTT CCAGTTCCAA AAAGGCTTAG AAAGGGAGTT CATCTATCGT 4980 

GTGGAAAAAG AAAAAAGTTA AGGCAGGTGT TCTCCTCTAC GCAGTCACCA TAGCAGCCAT 5040

CTTTAGTCTT TTGTTGCAAT TTTATTTGAA CCGACAAGTC GCCCACTATC AAGACTATGC 5100 

TTTGAATAAA GAAAAATTGG TTGCTTTTGC TATGGCTAAA CGAACCAAAG ATAAGGTTGA 5160 

GCAAGAAAGT GGGGAACAGT TTTTTAATCT AGGTCAGGTA AGCTATCAAA ACAAGAAAAC 5220 
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TGGCTTAGTG 


ACGAGGGTTC 


GTACGGATAA GAGCCAATAT 


GAGTTTCTGT 


TTCCTTCAGT 


5260 


CAAAATCAAA 


GAAGAGAAAA 


GAGATAAAAA 


GGAAGAGGTA 


GCGACCGATT 


CAAGCGAAAA 


5340 


AGTGGAGAAG 


AAAAAATCAG 


AAGAGAAGCC 


TGAAAAGAAA 


GAGAATTCAT 


AGTCAATTCA 


34UU 


ACTATAATGC 


GTTGAATCCA 


GAATAGTCCA 


CTGTAGTTTC 


TAGAAAATTG CTGGAAATGG 




ATGTTAAGCT 


CCAATTCATT 


TGTTTATATC 


TTATTTCAGT 


TTACTATACT 


TTGTGCTAAA 




TTAAAGATAT 


GAAACATGAT TTTAACCACA AAGCAGAAAC 


TTTCGATTCC 


CCTAAAAATA 


5580 


TCTTCCTCGC 


AAACTTGGTA 


TGTCAAGCAG 


CCGAGAAACA 


GATTGATCTT 


CTATCAGACA 


5640 


AAGAAATTTT AGATTTCGGT GGTGGCACGG GTCTATTAGC 


CTTGCCCCTA 


ACCGCTAGCC 


5700 


AAGCAGGCTA 


AGTCAGTCAC 


TCTTGTAGAC 


ATTTCTGAGA 


AAATGTTGGA 


GCAAGCTCGT 


5760 


TTGAAAGTGG 


AGCAGCAAGC 


AATCAAGAAT 


ATCCAGTTTT 


TGGAGCAAGA 


TTTACCGAAA 


5820 


AATCCCTTGG 


AGAAAGAGTT 


TGATTGCCTT 


GCTGTTAGTC 


GGGTTCTTCA 


TCATATGCCT 


5880 


GATTTGGATG 


CGGCTCTCTC 


ACTGTTTCAT 


CAACATTTGA 


AGGAAGATGG 


GAAACTCATC 


5940 


ATTGCTGATT 


TTACCAAGAC 


AGAAGCTAAT 


CATCATGGAT 


TTGATTTAGC 


TGAACTGGAA 


6000 


AACAAGCTAA 


T'lvi&fjr'A'pnn 

X. 1 \Jt\**3\r** 1 v\> 


11111 l,n 1 V>1 


GTGCATAGTC 


AGATTCTCTA 


TAGTGCTGAA 


6060 


GACCTGTTTC AAGGAAATCA CTCAGAATTC TTTTTAATAG TAGCCCAAAA ATCACTCGCC 6120

TAGTCAGGGA GTGATTTTTC TATAAGGATG GAAAAAAGAA GGGAAATTTG GTAAGATAGG 6180

AATATGGATT TTGAAAAAAT TGAACAAGCT TATACCTATT TACTAGAGAA TGTCCAAGTC 6240

ATCCAAAGTG ATTTGGCGAC CAACTTTTAT GACGCCTTGG TGGAGCAAAA 1 AuV. A IViAT 6300

CTGGATGGTG AAACTGAGCT AAACCAGGTC AAGGAGAACA ATCAAACCCT TAAGCGTTTA 6360

GCACTACGCA AAGAAGAATG GCTCAAGACC TACCAGTTTC TCTTGATGAA GGCTGGGCAA 6420

ACAGAACCCT TGCAGGCCAA TCACCAGTTT ACACCGGATG CTATTGCTTT GCTTTTGGTG 6480

TTTATTGTGG AAGAGTTGTT TAAAGAGGAG GAAATTACTA TCCTCGAAAT GGGTTCTGGG 6540

ATGGGAATTC TAGGCGCTAT TTTCTTGACC TCGCTTACTA AAAAGGTGGA TTACTTGGGA 6600

ATGGAAGTGG ATGATTTGCT GATTGATCTG GCAGCTAGCA TGGCAGATGT AATTGGTTTG 6660

CAGGCTGGCT TTGTCCAAGG AGATGCCGTT CGCCCACAAA TGCTCAAAGA AAGCGATGTG 6720

GTCATCAGTG ACTTGCCTGT CGGCTATTAT CCTGATGATG CCGTTGCGTC GCGCCATCAA 6780

GTTGCTTCTA GCCAAGAACA TACTTACGCC CATCACTTGC TCATGGAACA AGGGCTTAAG 6840

TACCTCAAGT CAGACGGATA CGCTATTTTT CTAGCTCCGA GTGATTTGTT GACCAGTCCT 6900

CAAAGTGATT TGTTAAAAGA ATGGCTGAAA GAAGAGGCGA GTCTGGTTGC TATGATTAGT 6960

CTGCCTGAAA ATCTCTTTGC TAATGCCAAA CAATCTAAGA CTATTTTTAT CTTACAGAAG 7020

AAAAATGAAA TAGCAGTAGA GCCTTTTGTT TATCCACTTG CTAGCTTGCA AGATGCAAGT 7080


GTTTTAATGA AATTTAAAGA AAATTTTCAA AAATGGACTC AAGGTACTGA AATATAAAAT 7140

AGATTTTGTT ATAATAGTTG AAAACGCTTA AAAAGGGGTA TCATGTTATG ACAAAAACAA 7200

TTGCAATCAA TGCAGGAAGT TCAAGTTTGA AATGGCAATT ATACTTAATG CCAGAAGAAA 7260

AAGTATTGGC GAAAGGTTTG ATTGAACGTA TCGGTTTGAA AGATTCAATT TCAACTGTAA 7320

AATTTGACGG CCGTTCTGAA CAACAAATTT TGGATATTGA AAATCATATA CAAGCCGTTA 7380

AAATTTTATT GGATGACTTG ATTCGTTTCG ATATTATCAA GGCTTATGAC GAGATTACAG 7440

GTGTTGGACA TCGTGTTGTT GCTGGTGGAG AATATTTCAA AGAATCAACA GTTGTTGAGG 7500

GAGATGTTTT AGAAAAAGTT GAAGAGTTGA GTTTGTTGGC TCCTCTACAC AACCCGGCCA 7560

ATGCAGCAGG TGTTCGTGCC TTCAAGGAAT TGTTGCCAGA CATTACCAGT GTAGTTGTTT 7620

TTGATACTTC CTTCCACACA AGTATGCCAG AGAAAGCTTA TCGCTACCCT CTACCAACAA 7680

AATATTACAC AGAAAACAAG GTTCGTAAAT ACGGTGCTCA TGGTACAAGT CACCAGTTTG 7740

TAGCAGGAGA AGCTGCAAAA CTCTTGGGAC GTCCATTAGA AGACTTGAAG TTAATTACCT 7800

GTCATATTGG TAACGGAGGC TCAATTACAG CTGTGAAAGC CGGCAAATCT GTAGACACTT 7860

CTATGGGGTT CACTCCTCTT GGTGGTATTA TGATGGGAAC GCGTACAGGG GATATTGATC 7920

CAGCTATCAT TCCTTATTTA ATGCAATATA CAGAGGATTT TAACACACCA GAAGATATCA 7980

GTCGTGTTCT TAACCGTGAA TCAGGTCTTT TGGGAGTTTC TGCTAATTCT AGCGATATGC 8040

GCGATATAGA AGCAGCTGTA GCAGAAGGGA ATCACGAGGC TAGCTTGGCT TATGAAATGT 8100

ATGTTGACCG TATCCAAAAA CATATCGGTC AGTACCTTGC AGTGCTAAAT GGAGCAGATG 8160

CCATTGTTTT CACAGCAGGT GTCGGTGAAA ATGCAGAGAG TTTCCGTCGT GATGTAATCT 8220

CAGGGATTTC GTGGTTTGGT TGTGATGTTG ATGATGAAAA GAATGTCTTT GGCGTTACAG 8280

GAGACATCTC AACAGAGGCA GCTAAAATCG GTGTCTTGGT TATTCCAACA GATGAAGAAT 8340

TAGTCATTGC CCGTGACGTT GAACGCTTGA AAAAATAAGT GAAACTAAAA AAATATTCAA 8400

TACAAGGAGT TGGGAAAGTT ATTTTTCCAG CTTCTTTTTC TGATGAAATT GTCCAAAACC 8460

TTGCTATGAT TGGCTTTTTT GAAAAATATG GTATAATAGT AGTAATTTAA TAGATGGAGT 8520

TGAGTTTTGA AGAAAAACTT TCGTGTAAAA AGAGAGAAAG ATTTTAAGGC GATTTTCAAG 8580

GAGGGGACAA GTTTTGCTAA TCGCAAATTT GTGGTCTACC AATTAGAAAA CCAGAAAAAC 8640

CGTTTTCGAG TAGGTCTATC AGTTAGCAAA AAACTGGGGA ATGCCGTCAC TAGAAATCAA 8700

ATTAAGCGAC GGATTCGGCA TATTATCCAG AATGCAAAAG GGAGTCTGGT AGAAGATGTC 8760

GACTTTGTTG TCATTGCTCG AAAAGGAGTC GAAACCTTGG GATACGCAGA GATGGAGAAA 8820

AATCTACTCC ATGTATTAAA ATTATCAAAG ATTTACCGGG AAGGAAATGG GAGTGAAAAA 8880 

GAAACTAAAG TTGACTAGTT TGCTAGGACT GTCTCTGTTA ATCATGACAG CCTGTGCGAC 8940 

TAATGGGGTA ACTAGCGATA TTACAGCCGA ATCGGCTGAT TTTTGGAGTA AATTGGTTTA 9000 

CTTCTTTGCG GAAATCATTC GCTTTTTATC GTTTGATATT AGTATCGGAG TGGGGATTAT 9060 

TCTCTTTACG GTCTTGATTC GTACAGTCCT CTTGCCAGTC TTTCAGGTGC AAATGGTGGC 9120 

TTCTAGGAAA ATGCAGGAAG CTCAGCCACG CATTAAGGCG CTTCGAGAAC AATATCCAGG 9180 

TCGAGATATG GAAAGCAGAA CCAAACTAGA GCAGGAAATG CGTAAAGTAT TTAAAGAAAT 9240 

GGGTGTCAGA CAGTCAGACT CTCTTTGGCC GATTTTGATT CAGATGCCGG TTATTTTGGC 9300 

CCTGTTCCAA GCCCTATCAA GAGTTGACTT TTTAAAGACA GGTCATTTCT TATGGATTAA 9360 

CCTTGGTAGT GTGGATACAA CCCTTGTTCT TCCGATTTTA GCAGCAGTAT TCACCTTTTT 9420 

AAGTACTTGG TTGTCCAACA AAGCTTTGTC TGAGCGAAAT GGCGCTACGA CTGCGATGAT 9480 

GTATGGGATT CCAGTCTTGA TTTTTATCTT TGCAGTTTAT GCGCCAGGTG GAGTCGCCCT 9540 

ATACTGGACA GTGTCTAATG CTTATCAAGT CTTGCAAACC TATTTCTTGA ATAATCCATT 9600 

CAAGATTATC GCAGAGCGCG AGGCCGTAGT ACAGGCACAA AAAGATTTGG AAAATAGAAA 9660 

AAGAAAAGCC AAGAAAAAGG CTCAGAAAAC GAAATAAATA AGGAGGAATC TGGTAGTGGT 9720 

AGTATTTACA GGTTCAACTG TTGAAGAAGC AATCCAGAAA GGATTGAAAG AATTAGATAT 9780 

TCCAAGAATG AAGGCTCATA TCAAAGTCAT TTCTAGGGAG AAAAAAGGCT TTCTTGGTCT 9840 

ATTTGGTAAA AAACCAGCCC AAGTGGATAT TGAAGCGATT AGTGAAACGA CTGTTGTCAA 9900 

AGCAAATCAA CAGGTAGTAA AAGGCGTTCC GAAAAAAATC AATGATTTGA ACGAGCCTGT 9960 

GAAGACGGTT AGTGAAGAAA CCGTTGACCT TGGTCATGTG GTTGATGCTA TTAAAAAAAT 10020 

AGAGGAAGAA GGTCAAGGTA TTTCTGATGA AGTCAAGGCT GAAATCTTAA AACATGAAAG 10080 

ACATGCCAGC ACTATCTTAG AAGAAACTGG TCACATTGAG ATTTTAAATG AACTTCAAAT 10140 

CGAGGAAGCG ATGAGGGAAG AAGCAGGCGC TGATGACCTT GAAACTGAGC AAGACCAAGC 10200 

TGAAAGTCAA GAACTAGAAG ACTTGGGCTT GAAAGTTGAA ACGAACTTTG ATATTGAACA 10260 

AGTAGCTACG GAAGTAATGG CTTATGTTCA AACGATTATT GATGACATGG ATGTTGAGGC 10320 

TACACTTTCA AATGATTATA ACCGTCGTAG CATCAATCTA CAAATTGACA CCAACGAACC 10380 

AGGTCGTATT ATCGGCTACC ATGGTAAAGT CTTGAAGGCC TTGCAACTGT TGGCTCAAAA 10440 

TTATCTTTAC AACCGCTATT CCAGAACCTT CTACGTTACA ATCAATGTCA ATGATTATGT 10500 

CGAACACCGT GCAGAAGTCT TGCAGACCTA TGCGCAAAAA TTGGCGACTC GTGTTTTGGA 10560 
AGAAGGGCGC AGTCATAAAA CAGATCCAAT GTCAAATAGC GAACGCAAGA TTATCCATCG 10620

TATTATTTCA CGTATGGATG GCGTGACTAG TTACTCTGAA GGTGATGAGC CAAATCGCTA 10680

TGTTGTTGTA GATACAGAAT AAGTAAAATC AGGTTTATCC TGATTTTTTG CTAGTTAGAG 10740

GAGGTTAAAC TGATGTTGAA TAAGATAAGA GACTATTTAG ACTTTGCTGG TTTGCAGTAC 10800

CGTAATCCTG ATAAAGCGGG AGCAGAGCGA GAGAAGATGC TGGCATTCCG CCACAAAGGA 10860

CAAGAGGCCC GAAAGGTTTT TACAGAACTG GCCAAAGCCT TTCAAGCAAG CCATCCAGAA 10920

TGGCAACTCC AACAGACTAG CCAGTGGATG AATCAGGCCC AGCGTTTGAG ACCAGATTTT 10980

TGGGTTTATC TACAGAGAGA CGGACAAGTG ACAGAACCTA TGATGGCCTT ACGTTTGTAT 11040

GGGACATCTA CTGACTTTGG AATTTCTTTG GAAGTCAGTT TCATCGAACG TAAGAAGGAT 11100

GAGCAAACAC TGGGCAAGCA GGCCAAAGTT TTAGACATTC CAACCGTTAA AGGGATTTAT 11160

TATCTAACCT ACTCTAATGG TCAAAGTCAA CGGTGGGAGG CGAATGAAGA AAAGCGTCGT 11220

ACTTTACGCG AGAAGGTGAG AAGTCAAGAA GTTCGAAAAG TTTTAGTGAA GGTAGATGTT 11280

CCTATGACAG AAAATTCGTC TGAAGAAGAA ATCGTAGAAG GCTTATTGAA GTCTTATTCT 11340

AAAATTCTTC CCTATTATCT AGCTACGAGA AAATAAGATA ATTTGTAAAA CATCATAAAT 11400

CATACAGTCC AAGAGTGAAC AGTCCGCTGT GTAATTCTTG GTCTTTTTGT TTGCGCTTTC 11460

GCATTATATA ATAAACTTAC AAAAACAATT CAAAAGGAGA ACAATTATGG AAGTCGTTTC 11520

AAGTGTTCTA AATTGGTTTT CTAGCAATAT TTTGCAGAAT CCCGCATTTT TCGTAGGTTT 11580

ATTGGTGTTG ATAGGATATG CACTTTTGAA AAAACCTGCC CATGACGTTT TTTCAGGGTT 11640

TGTTAAAGCA ACAGTAGGGT ATATGTTGCT TAACGTGGGT GCTGGTGGTT TGGTTACAAC 11700

CTTTCGTCCA ATCTTAGCAG CTCTTAACTA CAAATTCCAA ATTGGTGCAG CGGTTATCGA 11760

CCCTTACTTT GGACTTGCTG CAGCAAACAA CAAAATTGTA GCAGAGTTTC CAGATTTTGT 11820

TGGAACTGCA ACTACAGCTC TATTGATTGG TTTTGGAATA AATATCTTGC TCGTAGCTCT 11880

TCGAAAGATT ACGAAGGTAA GAACCCTCTT TATTACTGGT CACATCATGG TACAACAAGC 11940

TGCAACAGTA TCTCTTATGG TTCTATTCTT AGTACCACAA TTGCGCAATG CTTACGGTAC 12000

AGCAGCGATT GGTATCATCT GTGGACTTTA CTGGGCAGTT AGTTCAAATA TGACTGTTGA 12060

GGCAACTCAA CGCTTGACTG GTGGTGGCGG ATTTGCGATT GGTCACCAAC AGCAATTTGC 12120

AATCTGGTTT GTAGATAAAG TAGCAGGACG CTTTGGTAAG AAAGAAGAAA GTTTAGACAA 12180

TCTTAAATTA CCTAAGTTCC TCTCAATCTT CCACGATACA GTTGTTGCAT CTGCTACCTT 12240

GATGCTCGTA TTCTTCGGAG CCATTCTTTT AATCTTGGGT CCAGACATTA TGTCTAATAA 12300






AGAAGTCATC ACTTCAGGAA CTCTATTCAA TCCTGCTAAA CAAGATTTCT TTATGTACAT 12360

TATCCAAACA GCCTTTACCT TCTCAGTTTA CTTGTTCGTT TTGATGCAAG GTGTCCGAAT 12420

GAGTTGACAA ACGCCTTCCA AGGTATTTCA AACAAATTGT TGCCAGGTTC 12480

GTTGACGTTG CAGCTTCTTA TGGATTTGGT TCTCCAAATG CTGTCTTGTC 12540

TTTGGTTTGA TTGGTCAATT GATTACAATT GTTTTGCTCA TCGTCTTTAA 12600

CTTATTATTA CAGGATTTGT ACCAGTGTTC TTTGACAATG CAGCCATTGC 12660


yrtj 1 k» 1 AIAjI. 1 GATAAACGCG GCGGATGGAA AGCGGCTGTT ATCCTTTCCT TTATATCAGG 12720

TGTCCTTCAA GTTGCTCTAG GAGCTCTTTG CTCGATTTGG CATCTTATGG 12780

I Ut*t_ 1 ACl~ AT GGAAATATCG ACTTTGAATT CCCATGGCTT GGATTTGGAT ATATCTTCAA 12840

A L Al_t- i iubT ATTGTTGGTT ATGTACTTGT GTGTCTCTTC TTGCTTGTTA TTCCTCAACT 12900

I\,AATTTGCC AAAGCAAAAG ATAAAGAGAA ATATTACAAC GGTGAAGTTC AAGAAGAAGC 12960

TTAGTATCTA GAAAAGGAGA AATAAAATGG TTAAAGTATT AGCAGCGTGC GGAAATGGAA 13020

TGGGTTCATC AATGGTTATC AAGATGAAGG TTGAAAATGC TCTCCGTAAG CTTAATCAAA 13080

CAGATTTTAC AGTCAATTCA TGCAGTGTCG GTGAAGCTAA AGGTTTAGCA GTAGGATATG 13140

ACATCGTAAT CGCTTCTCTT CATTTGATTC AAGAATTGGA AGGGCGAACT AATGGGAAGT 13200

1 AAI 1ajUC»Q.T TGATAACTTG ATGGATGATA AAGAAATCAC CGAAAAACTC AGTCAAGCAC 13260

rACAGTAAAA GGTTGGAGGG GGCTGGACAG AAACTGAGAG TTATCGTTTC TGTCCTTCTC 13320

CCTCTTTAAA TAAAGGAGGC AGATATGAAT TTAAAACAAG CTTTAATTGA CAATGACTCG 13380

A 1 L(.UAL i AO \3l 1 l AUUaLlL. TAACAATTGG AAAGAAGCAG TCAAGGTAGC AGTAGATCCC 13440

1 4 AAI 1\jAAA GTGGGGCAAT TTTGCCAGAG TATTACGATG CTATCATTGA ATCGACTGAA 13500

unu 1 A 1 uuul. CTTACTATAT CTTGATGCCA GGTATGGCTA TGCCCCACGC TAGACCTGAA 13560


GCAGGTGTGC AAAGTGATGC CTTTTCATTG ATTACCTTAC AAAATCCTGT TGTATTTTCA 13620

GATGGGAAAG AGGTATCTGT TTTGTTGGCA CTAGCAGCAA CAAGTTCAAA AATTCACACA 13680

AGTGTAGCCA TTCCACAAAT TATTGCCCTA TTTGAATTAG AAGATTCTAT TGCACGTTTA 13740

CAGGCTTGCC AGACTAAAGA AGATGTCTTG GCTATGATTG AAGAATCTAA GGATAGCGCT 13800

TATCTCGAAG GATTGGATTT GGAAAGTTAG AAAGAGGAAT AAAGAAATGA CAAAAAGAAT 13860

ACCTAATTTA CAAGTTGCAT TAGACCATTC AGACTTGCAA GGAGCGATTA AAGCAGCTGT 13920

TTCTGTTGGT CAGGAAGTAG ATATTATCGA AGCTGGAACT GTTTGCTTGC TTCAAGTTGG 13980

AAGTGAACTG GCTGAAGTCT TGCGTAGCCT TTTCGCAGAT AAGATTATTG TGGCAGACAC 14040

AAAATGTGCT GATGCTGGTG GAACAGTTGC TAAAAATAAT GCGGTTCGTG GAGCAGACTG 14100

GATGACTTGT ATCTGTTGTG CAACCATCCC TACTATGGAA GCAGCTCTAA AGGCTATCAA 14160

GACTGAACGA GGAGAACGAG GCGAAATCCA GATCGAGCTT TATGGCGATT GGACTTTTGA 14220 

ACAAGCTCAG CTTTGGCTAG ATGCAGGTAT CTCACAAGCT ATTTATCACC AATCTCGTGA 14280 

TGCTCTTCTT GCTGGTGAAA CTTGGGGTGA AAAAGACCTT AATAAGGTTA AAAAACTCAT 14340 

TGACATGGGC TTCCGTGTAT CTGTAACAGG TGGTCTAGAT GTAGATACTC TCAAACTCTT 14400 

TGAAGGTATT GATGTCTTTA CCTTTATCGC AGGTCGTGGA ATTACAGAGG CTGTGGATCC 14460 

AGCAGGAGCA GCGCGTGCCT TCAAGGATGA AATCAAACGA ATTTGGGGGT AAATCATGGT 14520 

ACGTCCAATT GGAATTTATG AAAAGGCAAG CCCAACACAC TGTACTTGGC TAGAACGTTT 14580 

AAATTTTGCC AAGGAGTTAG GCTTTGATTT TGTCGAGATG TCTATTGAGG AACGTGACGA 14640 

GCGTTTAGCA AGACTTGACT GGAGTAAGGA AGAACGCTTG GAAGTTGTCA AAGCAATCTA 14700 

TGAAACTGGT GTTCGTATTC CTTCTATCTG TTTTTCAGGC CATCGTCGCT ACCCATTGGG 14760 

TTCAAAAGAT CCAGTTCTAG AGGAAAAATC TCTAGAACTC ATGAAAAAAT GTATCGAATT 14820 

AGCTCAAGAC TTGGGAGTTC GTACGATTCA ATTAGCTGGT TACGATGTTT ACTATGAGGA 14880 

AAAGTCACCC CAGACACGCC AACGTTTTAT CAAAAATTTG AGAAAAGCCT GTGACTGGGC 14940 

TGAAGAAGCT CAGGTGGTAC TTGCTATTGA AATTATGGAT GATCCTTTCA TCAGTAGGAT 15000 

CGAAAAATAT TTGGCTATAG AAAAAGAGAT TGACTCTCCC TTCGTCTTTG TATATCCAGA 15060 

TATTGGTAAT GTGTCTGCAT GGCATAATGA TATCTATAGT GAGTTTTATC TTGGTCATCA 15120 

TGCCATCGCA GCTCTCCATC TCAAGGATAC TTATGCAGTG AGAGAAAGTT GAAAGGGCGA 15180 

GTTCCGAGAT GTACCTTTCG GGCAAGGTTG TGTCAAATGG GAAGAAGCTT TCGATATTTT 15240 

AAAGGAAACC AATTATAATG GACCTTTCCT AATCGAAATG TGGTCTGAAA ATTGTGAAAC 15300 

AGTAGAAGAA ACACGCGCAG CCATTCAAGA GGCGCAAGCT TTTCTCTATC CACTCATTAA 15360 

GAAAGCAGGT TTGATGTAAG ATGAATCAAG TAATCAATGC TATGCGTAAA CGAGTCTGTG 15420 

ATGCCAATCA ATCATTGCCA AAACATGGAC TTGTCAAATT TACCTGGGGG AATGTATCTG 15480 

AAGTTAATCG CGAACTCGGT GTCATTGTTA TCAAACCATC AGGCGTGGAT TATGACGAAT 15540 

TGACACCTGA AAACATGGTA GTGACTGATC TAGATGGTAA GATCCTAGAA GGGGATTTAA 15600 

GACCATCTTC CGACCTCCCA ACTCATGTGC AATTATATAA GACTTGGTCA GAAATTGGTA 15660 

GTGTGGTTCA CACCCATTCG ACAGAAGCTG TTGGTTGGGC TCAGGCAGGT CGTGATATTC 15720 

CTTTCTACGG AACAACCCAT GCAGATTATT TCTACGGTTC AATCCCTTGC GCCCGTAGTT 15780 

TGACCAAGGA CGAAGTAGAA GTGGCCTATG AAAAAGATAG TGGCCTGGTT ATCGTAGAAG 15840 
AGTTTGAACA TCGCGGACTT AACCCGGTTG AAGTACCAGG AATTGTTGTA CGCAATCACG 15900

GTCCATTCAC CTGGGGCAAA AATCCAGAGA ATGCTGTTTA TCACTCTGTC GTACTAGAGG 15960 

AAGTATCAAA GATGAATCGC TTTACAGAAC AAATCAATCC AAGAGTTGGA CCTGCTCCCC 16020 

AGTACATACT AGAAAAACAC TACCAACGTA AACATGGACC AAATGCTTAT TATGGTCAAA 16080 

AGTAAGAACG ATGAAGGAGG AGAAAAAGAT AAATTTAGCT CCTCTTTTTA CATTTGATTT 16140 

TTATTGAGAG TAAAGTTGGA GTTGAAGTAA TTTTAAAAGA TTTTTTAGAA ATAGCGCTTG 16200 

ATATATATAT GGTAAAATAA AAAGAATTGC TGTGATATCA ATAGATTTGG GGGATTTTTT 16260 

AATATGGTAC TGGATAAGGC AAGTTGTGAT TTGCTTCAAT ATTTGATGGA TCAAGAAACG 16320 

TCCAAAACGA TTATGGCGAT TTCGAAAGAT TTGAAAGAGT CAAGAAGGAA AATTTATTAT 16380 

CACATTGACA AAATCAATGC TGCTCTGGGT GACGAGGCGC TTCACATCAT TAGTATTCCA 16440 

CGAATTGGTA TTCACTTAAC GGAAGAGCAG AGAGATGCTT GTTGTAAACT ATTATCGGAA 16500 

GTAGATTCGT ACGATTATAT CATGAGTGCG CATGAACGTA TGATGATAAT GTTACTATGG 16560 

ATAGGTATTT CTAAAGAACG TATTACGATT GAAAAATTGA TAGAGTTAAC AGAGGTATCT 16620 

AGGAATACTG TTCTCAATGA TTTGAATAGT ATTCGTTATC AACTAACTTT GGAACAATAT 16680 

CAGGTGATCT TGCAAGTGAG CAAGTCACAG GGATACAACC TTCATGCCCA CCCTCTTAAT 16740 

AAAATTCAGT ATCTTCAATC GCTTCTATAT CATATTTTTA TGGAAGAAAA TGCCACTTTT 16800 

GTATCTATTT TAGAAGATAA GATGAAAGAG AGGTTAGATG ATGAGTGTTT GCTTTCTGTT 16860 

GAAATGAACC AATTTTTTAA GGAACAGGTT CCTTTAGTTG AACAAGATTT AGGGAAGAAA 16920 

ATAAACCATC ATGAAATAAC TTTTATGTTG CAGGTTCTAC CTTATTTGCT GTTAAGCTGT 16980 

CATAATGTTG AACAGTATCA AGAAAGACAT CAGGATATAG AGAAAGAATT TTCTTTGATA 17040 

AGAAAAAGAA TAGAGTATCA GGTGTCTAAG AAATTAGGAG AACGGTTGTT TCAAAAGTTT 17100 

GAAATTTCTT TGTCAGGACT TGAAGTTTCT CTTGTAGCTG TTCTCCTCCT CTCCTATCGT 17160 

AAAGATTTGG ATATTCATGC AGAAAGTGAT GATTTTCGGC AATTAAAACT TGCTTTAGAA 17220 

GAATTTATCT GGTATTTTGA ATCACAAATC CGAATGGAGA TTGAGAACAA GGATGATTTG 17280 

TTACGAAATT TGATGATCCA CTGTAAAGCC TTGTTATTTA GAAAGACTTA CGGTATTTTT 17340 

TCTAAAAATC CTCTAACAAA ACAAATTCGA TCCAAGTATG GAGAATTATT TTTAGTCACT 17400 

AGAAAATCTG CGGAAATTTT AGAAGGAGCA TGGTTTATTC GGCTAACAGA CGATGATATT 17460 

GCCTATTTGA CGATTCATAT TGGAGGATTT TTAAAATATA CACCATCATC TCAAAAAAAT 17520 

ATGAAAAAAG TTTATCTCGT TTGTGATGAA GGTGTTGCGG TTTCGAGACT TTTGCTGAAA 17580 

CAATGCAAAC TTTATTTTCC AAATGAGCAA ATTGACACTG TATTTACAAC AGAACAATTT 17640 
AAGAGTGTGG AAGATATTGC ACAAGTTGAT GTAGTGATTA CTACTAATGA TGATTTGGAT 17700

AGCAGATTTC CGATTTTAAG GGTTAATCCT ATCCTTGAAG CAGAAGATAT TTTGAAAATG 17760

CTAGACTATC TTAAACACAA TATATTTCGT AATAAGAGCA AAAGTTTCAG TGAAAATCTT 17820

TCTAGTCTTA TTTCGTCTTA TATTGTAGAC AGCAAGTTGG CTAGTAAGTT CCAAGAAGAG 17880

GTTCAAACAC TTATAAATCA AGAAATAGTA GTTCAAGCTT TTTTGGAAGr TATTTGAAGG 17940

ACAGTCCAAT GATGAACACA AACCTGTGTk TTTCsTGGTC TTTTtTAGTG TTTTGAAGGG 18000

TGGkATACTA ATCTCAAAGA TAACAATTAT ATCCAAAGGA GGCAACATAT GCCAAACGTC 18060

AAAGAAATTA CAAGAGAGTC ATGGATTTTA GCCACTTTCC CAGAGTGGPG AACATGGTTG 18120

AACGAAGAAA TCGAAGAAGA AGTCGTACCT GAAGGCAACT TTGCCATGTG GTGGCTAGGC 18180

AACTGTGGTA CTTGGATTAA GACACCAGCT GGTGCTAACG TTGTCATGGA CCTTTGGTCA 18240

AACCGTGGAA AATCAACCAA AAAAGTGAAA GATATGGTTC GTGGGCACCA AATGGCAAAT 18300

ATGGCAGGTG TTCGTAAGCT GCAACCAAAC TTGCGTGTTC AGCCAATGGT TATCGATCCA 18360

TTTGCTATCA ACGAACTAGA CTATTACTTA GTTTCACACT TCCACAGTGA TCATATCGAC 18420

CCATACACAG CTGCAGCAAT TCTCAATAAT CCTAAGTTAG AGCATGTTAA GTTGG 18475



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7186 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

CCAGGATTTG GTACCGTTGC AAGTGGTGTG CCTTTCCTCC TAAAGGAAAA TGGAGGAAAA 60 

ATCAATCAAT CAGCACATTC AGATATCAAA GTTGCTAAGG TATTGGTCAA GGATGAAGAT 120 

GAAAAAAATC GCTTGCTTGC AGCAGGGAAT GACTTTAACT TTGTAACCAA TGTGGATGAT 180 

ATTTTATCAG ACCAGGATAT TACTATCGTA GTGGAATTGA TGGGGCGTAT TGAGCCTGCT 240 

AAAACCTTTA TCACTCGTGC CTTGGAAGCT GGAAAACACG TTGTTACTGC TAACAAGGAC 300 

CTTTTAGCTG TCCATGGCGC AGAATTGCTA GAAATCGCTC AAGCTAACAA GGTAGCACTT 360 

TACTACGAAG CAGCAGTTGC TGGTGGGATT CCAATTCTTC GTACTTTAGC AAATTCCTTG 420 

GCTTCTGATA AAATTACGCG CGTGCTTGGA GTAGTCAACG GAACTTCCAA CTTCATGGTG 480 

ACCAAGATGG TGGAAGAAGG CTGGTCTTAC GATGATGCTC TTGCGGAAGC ACAACGTCTA 540 
1 1 »>j^_/\vj A/u«ji.uAr*-t (xAvajAATGAC GTAGATGGGA TTGATGCAGC CTACAAGATG 600

iaj. 1 1 n*A <jL L AA TTTGC ctttggcatg aagattgcct ttgatgatgt AGCCCACAAG 660

iasaaa\.ia>la ATATCACACC AGAAGACGTA GCTGTAGCTC AAGAGCTTGG TTACGTAGTG 720

/u\Airfet>i i\a GTTCTATTGA GGAAACTTCT TCAGGTATTG CTGCAGAAGT GACTCCAACC 780

TTCCTACCTA AAGCGCACCC ACTTGCTAGT GTGAATGGCG TAATGAACGC TGTCTTTGTA 840

GAATCTATCG GTATTGGTGA GTCTATGTAC TACGGACCAG GTGCGGGTCA AAAACCAACT 900

GCAACAAGTG TTGTAGCTGA TATTGTCCGT ATCGTTCGTC GTTTGAATGA TGGTACTATT 960

GGCAAAGACT TCAACGAATA TAGCCGTGAC TTGGTCTTGG CAAATCCTGA AGATGTCAAA 1020

GCAAACTACT ATTTCTCAAT CTTGGCTCTA GACTCAAAAG GTCAGGTCTT GAAGTTGGCT 1080

GAAATCTTCA ATGCTCAAGA TATTTCCTTT AAGCAAATCC TTCAAGATGG CAAAGAGGGT 1140

GACAAGGCGC GTGTCGTTAT CATCACACAC AAGATTAATA AAGCCCAGCT TGAAAATGTC 1200

TCAGCTGAAT TGAAGAAGGT TTCAGAATTC GACCTCTTGA ATACCTTCAA GGTGCTAGGA 1260

GAATAAGATG AAGATTATTG TACCTGCAAC CAGTGCCAAT ATCGGGCCAG GTTTTGACTC 1320

GGTCGGTGTA GCTGTAACCA AGTATCTTCA AATTGAGGTC TGCGAAGAAC GAGATGAGTG 1380

GCTGATTGAA CACCAGATTG GCAAATGGAT TCCACATGAC GAGCGTAATC TCTTGCTCAA 1440

AATCGCTTTG CAAATTGTAC CAGACTTGCA ACCAAGACGC TTGAAAATGA CCAGTGATGT 1500

CCCTTTGGCG CGCGGTTTGG GTTCTTCCAG CTCGGTTATC GTTGCTGGGA TTGAACTAGC 1560

CAACCAACTG GGTCAACTCA ACTTATCAGA CCATGAAAAA TTGCAGTTAG CGACCAAGAT 1620

TGAAGGGCAT CCTGACAATG TGGCTCCAGC CATTTATGGT AATCTCGTTA TTGCAAGTTC 1680

TGTTGAAGGG CAAGTCTCTG CTATCGTAGC AGACTTTCCA GAGTGTGATT TTCTAGCTTA 1740

CATTCCAAAC TATGAATTAC GTACTCGCGA CAGCCGTAGT GTCTTGCCTA AAAAATTGTC 1800

TTATAAGGAA GCTGTTGCTG CAAGTTCTAT GGCCAATGTA GCGGTTGCTG CCTTGTTGGC 1860

AGGAGACATG GTGACCGCTG GGCAAGCAAT CGAGGGAGAC CTCTTCCATG AGCGCTATCG 1920

TCAGGACTTG GTAAGAGAAT TTGCGATGAT TAAGCAAGTG ACCAAAGAAA ATGGGGCCTA 1980

TGCAACCTAC CTTTCTGGTG CTGGGCCGAC AGTTATGGTT CTGGCTTCTC ATGACAAGAT 2040

GCCAACAATT AAGGCAGAAT TGGAAAAGCA ACCTTTCAAA GGAAAACTGC ATGACTTGAG 2100

AGTTGATACC CAAGGTGTCC GTGTAGAAGC AAAATAAAGA ATAGAAGATA GGATGGGGAA 2160

ACTCTTGACC AGAGGGGTTC ATATCCTTTT TGTGAAAAGA AGTTTATACT CAATGAAAAT 2220

CAAAGAGCAA ACTAGGAAGC TAGCCGCAGG CTGCTCAAAA CAGTGTTTTG AGGTTGCAGA 2280

TAGAACTGAC GAAGTCAGCT CAAGACACTG TTTTGAGGTT GCAGATAGAA CTGACGAAGT 2340
CAGTAACCAT ACTACGGTAA GGTGACGCTG ACGTGGTTTG AAGAGATTTT CGAAGAGTAT 2400

TAGTTAAAAA CGTGATAAAG GAGAAATAAA GATGGCAGAA ATTTATCTAG CAGGTGGTTG 2460

TTTTTGGGGC CTAGAGGAAT ATTTTTCACG CATTTCTGGA GTGCTAGAAA CCAGTGTTGG 2520

CTACGCTAAT GGTCAAGTCG AAACGACCAA TTACCAGTTG CTCAAGGAAA CAGACCATGC 2580

AGAAACGGTC CAAGTGATTT ACGATGAGAA GGAAGTGTCA CTCAGAGAGA TTTTACTTTA 2640

TTATTTCCGA GTTATCGATC CTCTATCTAT CAATCAACAA GGGAATGACC GTGGTCGCCA 2700

ATATCGAACT GGGATTTATT ATCAGGATGA AGCAGATTTG CCAGCTATCT ACACAGTGGT 2760

GCAGGAGCAG GAACGCATGC TGGGTCGAAA GATTGCAGTA GAAGTGGAGC AATTACGCCA 2820

CTACATTCTG GCTGAAGACT ACCACCAAGA CTATCTCAGG AAGAATCCTT CAGGTTACTG 2880

TCATATCGAT GTGACCGATG CTGATAAGCC ATTGATTGAT GCAGCAAACT ATGAAAAGCC 2940

TAGTCAAGAG GTGTTGAAGG CCAGTCTATC TGAAGAGTCT TATCGTGTCA CACAAGAAGC 3000

TGCTACAGAG GCTCCATTTA CCAATGCCTA TGACCAAACC TTTGAAGAGG GGATTTATGT 3060

AGATATTACG ACAGGTGAGC CACTCTTTTT TGCCAAGGAT AAGTTTGCTT CAGGTTGTGG 3120

TTGGCCAAGT TTTAGCCGTC CGATTTCCAA AGAGTTGATT CATTATTACA AGGATCTGAG 3180

CCATGGAATG GAGCGAATTG AAGTTCGTTC TCGTTCAGGC AGTGCTCACT TGGGTCATGT 3240

TTTCACAGAT GGACCGCGGG AGTTAGGCGG CCTCCGTTAC TGTATCAATT CTGCTTCTTT 3300

ACGCTTTGTG GCCAAGGATG AGATGGAAAA AGCAGGATAT GGCTATCTAT TGCCTTACTT 3360

AAACAAATAA AACAGAGAGT GGGGCTTCCC ACTTTCTTCA TTTCTAGAAT ATGAATAGAA 3420

GGGATTTATG AAACACCTAT TATCTTACTT CAAACCCTAC ATCAAGGAAT CAATTTTAGC 3480

CCCCTTGTTC AAGCTGTTAG AAGCTGTTTT TGAGCTCTTG GTTCCCATGG TGATTGCTGG 3540

GATTGTTGAC CAATCTTTAC CTCAGGGAGA TCAAGGTCAT CTCTGGATGC AGATTGGCCT 3600

GCTCCTTATC TTTGCAGTAA TTGGCGTTTT AGTGGCCTTG ATAGCTCAAT TTTACTCAGC 3660

AAAGGCAGCA GTAGGTTCTG CTAAGGAATT GACAAACGAT CTTTATCGTC ATATTCTTTG 3720

CTTGCCCAAG GACAGCAGAG ACCGTCTGAC AACTTCTAGT TTGGTCACTC GCTTGACTTC 3780

GGATACCTAC CAGATTCAGA CTGGTATCAA TCAATTCCTG CGTCTCTTTT TACGAGCGCC 3840

CATTATCGTT TTTGGTGCCA TTTTTATGGC TTATCGAATC TCAGCTGAGT TGACTTTCTG 3900

GTTCTTAGTC TTGGTTGCCA TTTTGACCAT TGTCATTGTA GGGTTATCTC GATTGGTCAA 3960

TCCTTTCTAC AGTAGTCTCA GAAAGAAAAC GGACCAACTG GTTCAGGAAA CGCGCCAGCA 4020

ATTGCAAGGG ATGCGGGTTA TTCGTGCTTT TGGTCAAGAA AAACGAGAGT TACAGATTTT 4080
TCAAACCCTT AACCAAGTTT ATGCTAGATT ACAAGAAAAG ACAGGTTTCT GGTCTAGTTT 4140

ATTAACACCT CTGACCTATC TGATTGTCAA TGGAACTCTT CTCGTTATTA TCTGGCAAGG 4200 

CTATATTTCA ATTCAAGGAG GAGTGCTCAG TCAAGGTGCT CTCATTGCTC TTATCAATTA 4260 

CCTCTTACAG ATTTTGGTGG AATTGGTCAA GCTAGCCATG TTGATCAATT CCCTCAACCA 4320 

GTCCTATATC TCAGTCAAGC GAATCGAGGA AGTCTTTGTT GAGGCTCCAG AGGATATCCA 4380 

TTCAGAGTTA GAACAAAAGC AAGCTACCAG AGATAAGGTT TTACAAGTCC AAGAATTGAC 4440 

CTTTACCTAT CCTGATGCGG CCCAGCCTTC TCTGAGATAC ATTTCCTTTG ATATGACTCA 4500 

AGGACAAATT CTAGGTATCA TCGGGGGAAC TGGTTCTGGT AAATCAAGCT TGGTGCAACT 4560 

CTTACTTGGA CTTTATCCAG TAGACAAGGG GAACATTGAC CTTTATCAAA ATGGACGTAG 4620 

TCCTCTTAAT TTGGAGCAGT GGCGGTCTTG GATTGCCTAT GTACCTCAAA AGGTCGAACT 4680 

CTTTAAAGGA ACCATTCGTT CCAACTTGAC TCTAGGTTTC AATCAAGAAG TATCTGACCA 4740 

GGAACTCTGG CAGGCCTTGG AGATTGCGCA AGCTAAGGAT TTTGTCAGTG AAAAGGAAGG 4800 

ACTCTTGGAT GCTCTAGTTG AGGCAGGGGG GCGAAATTTC TCAGGTGGAC AAAAACAAAG 4860 

ATTGTCTATC GCCCGAGCAG TCTTGCGCCA GGCTCCGTTT CTCATCCTAG ATGATGCAAC 4920 

CTCGGCACTG GATACCATTA CAGAGTCCAA GCTCTTGAAA GCTATTAGAG AAAATTTTCC 4980 

AAACACGAGC TTAATTTTGA TCTCTCAACG AACCTCAACT TTACAGATGG CGGACCAGAT 5040 

TCTCCTCTTG GAAAAAGGTG AGTTGCTAGC TGTTGGCAAG CACGATGACT TGATGAAATC 5100 

CAGCCAAGTC TATTGTGAAA TCAATGCATC CCAACATGGA AAGGAGGACT AGAATGAAAC 5160 

GACAAACTGT AAACCAGACG CTCAAACGTT TAGCCGTAGA TTTAGCAAGC CATCCTTTCC 5220 

TCCTTTTCCT AGCCTTTCTA GGAACTATTG CCCAAGTTGG CTTATCAATT TACCTACCTA 5280 

TTCTGATTGG GCAGGTCATT GACCAAGTCC TAGTGGCTGG TTCATCACCA GTTTTTTGGC 5340 

AGATTTTTCT CCAGATGCTC TTGGTGGTAA TAGGAAATAC TCTGGTACAA TGGGCCAATC 5400 

CTCTCCTCTA TAATCGTCTA ATCTTCTCTT ATACCAGAGA TTTACGGGAG CGAATCATCC 5460 

ATAAGCTCCA TCGTTTACCG ATTGCCTTTG TAGATAGGCA AGGTAGTGGA GAGATGGTTA 5520 

GTCGTGTAAC CACGGACATC GAACAGTTGG CAGCTGGCTT GACCATGATT TTTAACCAAT 5580 

TTTTCATTGG TGTTTTGATG ATTTTGGTCA GTATTCTAGC CATGCTCCAA ATTCATCTCC 5640 

TCATGACTCT CTTAGTCTTG CTGTTGACGC CACTGTCCAT GGTGATTTCA CGCTTTATTG 5700 

CCAAGAAATC CTATCATCTC TTCCAGAAGC AAACAGAGAC GAGGGGAATT CAGACTCAGT 5760 

TGATTGAAGA ATCGCTTAGT CAGCAGACTA TAATCCAGTC CTTCAATGCT CAAACAGAAT 5820 

TTATCCAAAG ATTGCGTGAG GCTCATGACA ACTACTCAGG CTATTCTCAG TCAGCCATCT 5880 
TTTATTCTTC AACGGTCAAT CCTTCGACTC GCTTTGTAAA TGCACTCATT TATGCCCTTT 5940

TAGCTGGAGT AGGAGCTTAT CGTATCATGA TGGGTTCAGC CTTGACCGTC GGTCGTTTAG 6000 

TGAGTTTTTT GAACTATGTT CAGCAATACA CCAAGCCCTT TAACGATATT TCTTCAGTGC 6060 

TAGCTGAGTT GCAAAGTGCT CTGGCTTGCG TAGAGCGTAT CTATGGAGTC TTAGATAGCC 6120 

CTGAAGTGGC TGAAACAGGT AAGGAAGTCT TGACGACCAG TGACCAAGTT AAGGGAGCTA 6180 

TTTCCTTTAA ACATGTCTCT TTTGGCTACC ATCCTGAAAA AATTTTGATT AAGGACTTGT 6240 

CTATCGATAT TCCAGCTGGT AGTAAGGTAG CCATCGTTGG TCCGACAGGT GCTGGAAAAT 6300 

CAACTCTTAT CAATCTCCTT ATGCGTTTTT ATCCCATTAG CTCGGGAGAT ATCTTGCTGG 6360 

ATGGGCAATC CATTTATGAT TATACACGAG TATCATTGAG ACAGCAGTTT GGTATGGTGC 6420 

TTCAAGAAAC CTGGCTCACA CAAGGGACCA TTCATGATAA TATTGCCTTT GGCAATCCTG 6480 

AAGCCAGTCG AGAGCAAGTA ATTGCTGCTG CCAAAGCAGG TAATGCAGAC TTTTTCATCC 6540 

AACAGTTGCC ACAGGGATAC GATACCAAGT TGGAAAATGC TGGAGAATCT CTCTCTGTCG 6600 

GCCAAGCTCA GCTCTTGACC ATAGCCCGAG TCTTTCTGGC TATTCCAAAG ATTCTTATCT 6660 

TAGACGAGGC AACTTCTTCC ATTGATACAC GGACAGAAGT GCTGGTACAG GATGCCTTTG 6720 

CAAAACTCAT GAAGGGCCGC ACAAGTTTCA TCATTGCTCA CGGTTTGTCA ACCATTCAGG 6780 

ATGCGGATTT AATTCTTGTC TTAGTAGATG GTGATATTGT TGAATATGGT AACCATCAAG 6840 

AACTCATGGA TAGAAAGGGT AAGTATTACC AAATGCAAAA AGCTGCGGCT TTTAGTTCTG 6900 

AATAAGCCAT TCTCTTTTGA AAGTTTATGG ACGAAAAAAG TTGCCTTCGA GTGACTTTTT 6960 

TGTTACAATA GCTAGAAAAA TTGTTCACTG TAATACTCAA TGAAAATCAA AGAGCAAACT 7020 

AGGAAGCTAG CCGTAGGTTG CTCAAAGCAC AGCTTTGAGG TTGTAGATAA GACTGACGAA 7080 

GTCAGTTCAA AACACTGTTT TGAGGTTGCA GATAGAACTG ACGAAGTCAG CTCAAAACAC 7140 

TGTTTTGAGG TTGCAGATAG AACTGACGAA GTCAGCTCAA AACAGG 7186 
(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14273 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

CTGAAAATTC TAAAAAATTT ATAAGTAAGG AATTAATTAG TTATTTTTGT GATAAAGTTT 60 
ATGATGAAAT ATTTGTTGAA GAGGTAGTTC CGCACGTTTT TCTGCCATAT GAATCTGACT 120

TACTTCTTAT TTTACCAGCT ACGGCAAATG TGATTGGCAA AATTGCTAAT GGTATTGCTG 180

ATGATTTAGT TACAGCAACT GTTTTAAACT TTAATAAAAA AATAATTTTT TGTCCCAATA 240

TGAACTCTAC TATGTGGGAC AATCACATAG TTCAAAGAAA TGTATCAATT CTAAAGGAGT 300

TGGGACATAT ATTTTTATTT GAGTCTAAAA AAACATATGA GGTAGGATTG CGTAAAGCAA 360

TAGATTCAAC ATGTTCAATG TTACAACCAC AGTCGTTAGT AAAAGAACTT ATCAAATTAG 420

AAAATATTGT CCTTGAAGAG GGACATTAAA AACTACTGAG AATATTAATG AGGGGAAAAA 480

ATGGAAAATT CATCAATCGA TGTAGATATG CTGTTGGAAG AATTGACACA AGAAGCAATG 540

GTCGTTGTTG CTGTTGATAA GGACTGTTAA TTTAAACTTA TGGCAATATA TGAAAGGTTA 600

CTGGATGTTT TAAATTATGC AGGCAGTAGC CTTTTATTAT ATACAAATGG ATAAAGTAAG 660

GATAATACAA TGATTAATAA AAAAATACAA CAAGTTGTTT TGGAATCATT ACAGAATTTT 720

TTGAATGGGA ACTTCATTTC GCCTTGTGTA GTCTATGATT TTGGCTTGCT GGAAACTGTA 780

CTTGATGAAT TTAAAAATCA AATTCCTGTA ACATTCAATT ACCAACTTTT TTATGCCGTT 840

AAAGCAAATT CAAATGAGAA GATACTTGAA TTCTTAGTAG ATAAAATTGA TGGAGTTGAT 900

GTGGCGTCAT TATCTGAATT AGATGTGGCT AAAAAATTTT TCCCACCAAC TCAAATTTCT 960

GTTAATGGTC CCGCATTTTC TTATGAAACT TTATATAATC TGATTAAAAA ACAATATAAA 1020

GTTGATATTA ACTTTTTGGA ACATCTTCAA CAATTTTCCC CAAAAGAATC TGTTGGAATA 1080

AGAGTAACGG AGCCAGATGA ACTTAATAAT CGTATGAGTC GATTTGGAAT AAATATTTGC 1140

AGTGATAATT GGACTAGTAA TTTACAAAAT CCTTTAATTA CACGACTGCA TTTTCATTTT 1200

GGAGAAAAAG ATGATAAATT TATTGTTAAG TTAGATAAAA TATTATTTAA GTTACAAGAA 1260

ATTAATAAAC TTAGAGAGGT TAGAGAAATA AATCTTGGAG GCGGTTTTAT GAAATTATTT 1320

ATGGAAAATC GTTTGAAAGA ATTTTTTCTA TCACTTATGG AAATCTATAA AAAGTACGAT 1380

ATTGATAGTA CTGTGACTAC AATAATAGAA CCAGGTAGTG CAATTACTTC ATTTTCTGCC 1440

TATATGATTA CTAGCCCAGT TAATGTTAGT GAGGTGAATG AGCAGCAGGT TATCACGTTA 1500

GACACATCAA TATACACCAA TACATTATGG TTTGTTCCGC ATATTATTAC AACGTTAAAT 1560

TCAAGTAGTA AAGAGCGTTA TAGTACTATT CTCTATGGTA ATACCTGTTA TGAACATGAC 1620

AAGTATAAAA TGAAAGTTTC GCTTCCAAGG TTAACTCAAA ATAGCAGTAT AGTGTTTTTT 1680

CCTGTAGGAG CTTATATAAA AAGCAATCAT TCAAATTTAC ATCGTAATGA TTTTATGCGG 1740

GAGGTATATT TGTGGACAAA AAACTTGACA TATTAGATAA AGTTAAGGAA TATTTAGGAA 1800

ATAAAACTAC TCAAATTCTG GATAATCAAT ATAAAGAATT TTTGAAACTT AATGATATAA 1860
GGCGAGCGTT TGGTATTTCA GAAAAAGTAT TAAACAATTC TTTTAATTTT ACGAGTAAAG 1920

AATTTAATGA TTTAATTAAT AACGAAAATT ATTTATTCGA ATATGCATGT AGAATTAGAG 1980

AGGAATGGAG AAAAAAATGC TTTAATCATT CTTATCGTTT TCTATGCTCA CCTATAATTA 2040

CAGATGATTT TCTTAACACG AAGACATTGA GAAGTAGCCA AATTGAATAT AAATATGAGC 2100

GATATTTATC GAAAAGTTCG ATAGGCGATA GAGCGGTTGA TGGCTTTGTT TCCTTCAATA 2160

CTTTAACAGC TAATGGTATG TCTGCTATTA AACTATGTCT TGAGATATTA AACTCTATTT 2220

TCTTCAAGAA GAAGATTGAT TTATTATATT CAACCGGATA TTATGAAACA AGATTTTTAT 2280

TAAATAATCT TGCTAAATCA GGTATTAGTT GCTATGAGGT AAGTAATTGT GAATTGGATA 2340

AAGATAAATT TTATAATGTA TTCATGATGG AACCCAATCG AGCCGATTTA ACATTACAAA 2400

AAACTGATTT CAAGATAGTA GAATATTTTG TTAAGTATAA AAATAATTCA ATAAAAGTCG 2460

TTATTTTAGA TATTTCATAT CAAGGTTCTA ATTTTAAATT AGTAGAATTT TTAGAGAAAT 2520

TTAAATTTGC GAATGTAATT ATTTTTGTGG TACGATCTTT GATAAAATTA GATCAAATGG 2580

GATTAGAATT GACAAATGGG GGAATAATAG AAGTGTTTAT TCCTAATCAT TTGAGAAAGT 2640

TGAAAAATTT TATTGAAGAG GAATTCAATA AATTTAGAAA TTCTCACGGA GCTAATCTAA 2700

GCCTCTATGA ATACTGTTTG CTTGATAATT CTTTAACTTT AAAAAATGAT TGGAACTATT 2760

CTGATTTAGT TATGAAATTT ACGAGTAATT TTTATGCTGA TATAAAAGAC TTGTTCATGG 2820

AAAATTCTGA TATTGAAATC ATCCATGAAG AGGGAGTACC TTTTGTATTT TTAGATTTAA 2880

TAGGTGAAGG TAAAAAAGAA TATGAAATGT TTTTTCAATG GTTAAACTTG TTTTACAAAC 2940

AGCTTGGAAT CACATTGTAT GCTAGAAATA GTTTTGGGTT TCGGAATCTA ACAGTAGAGT 3000

ATTTTGGAAT TATTGGGACA GAAAGATATA TATTTAAGAT TTGTCCAGGT GTTTATAAAG 3060

GGTTAAGTTA TTATTTGATG AAATTTTTAT TAAAATCTTT TTCAAATGAA TATTTAAAAA 3120

CTACTGATGA GGTTAATAGA TGAAAAATTT GATAAAGTTG CTAATAATTA GATTGATTGT 3180

TAACTTAGCA GACAGTGTAT TTTATATAGT AGCATTGTGG CACGTTAGCA ATAATTATTC 3240

TTCGAGCATG TTCTTAGGAA TATTTATTGC AGTAAATTAT CTACCGGATT TGTTACTAAT 3300

CTTTTTTGGA CCAGTTATTG ACAGAGTAAA TCCGCAAAAA ATTCTTATAA TATCAATTTT 3360

GGTTCAATTA GCAGTGGCTG TAATATTTTT ATTATTATTA AACCAAATAT CATTTTGGGT 3420

GATAATGAGT CTAGTGTTTA TTTCAGTAAT GGCTAGCTCC ATAAGTTACG TGATAGAAGA 3480

TGTGTTGATT CCTCAAGTGG TAGAATATGA TAAGATTGTA TTTGCAAATT CTCTTTTTAG 3540

TATTTCGTAT AAAGTATTAG ATTCTATTTT TAATTCATTC GCATCATTTT TACAGGTGGC 3600

AGTAGGATTT ATTTTATTGG TTAAGATAGA TATAGGCATA TTTTTACTTG CTCTATTTAT 3660

ATTGTTGTTG TTAAAATTTA GAACTAGCAA TGCGAATATA GAAAACTTCT CTTTCAAATA 3720 

TTACAAGAGA GAAGTGTTGC AAGGTACAAA GTTTATTTTA AATAATAAAT TATTATTTAA 3780 

AACCAGTATT TCTTTAACGC TTATAAACTT TTTTTATTCA TTTCAGACAG TAGTTGTACC 3840 

GATTTTTTCT ATTCGATATT TTGATGGTCC GATTTTTTAT GGTATTTTTT TAACTATTGC 3900 

TGGTTTGGGT GGTATATTGG GAAATATGCT AGCGCCAATC GTAATAAAAT ATTTAAAATC 3960

GAATCAAATT GTTGGTGTAT TTCTTTTTTT GAACGGCTCA AGTTGGTTAG TAGCAATTGT 4020 

TATAAAAGAC TATACTTTAT CACTTATTTT ATTTTTCGTT TGTTTTATGT CTAAAGGAGT 4080 

CTTCAATATT ATTTTTAATT CGTTGTACCA ACAAATACCT CCACATCAAC TTCTTGGTAG 4140 

GGTAAATACT ACCATTGATT CTATTATTTC TTTTGGAATG CCAATTGGTA GTTTAGTTGC 4200 

AGGAACGCTT ATTGATTTGA ATATTGAATT AGTGTTAATT GCTATTAGCA TACCTTATTT 4260 

TTTGTTTTCT TATATTTTTT ATACGGATAA TGGATTGAAA GAATTTAGTA TATATTAGAA 4320 

ATGTTTATGT TCATTCAAAA GCATAATGAC TATAACTGAA AAAGAAAAGT GATATCTTTA 4380 

AGGTTGTTCT TCTTGGTGGT GAGATTCGTG AGACAACCCA AGCTTTTGTC GGAAAGATTA 4440 

CCAATGCTTT GATGGATAGG ATGTACTTTA GCAAGATGTT TTTAGTGGTA ACGGTATCGT 4500 

GGATGGACGT GTAATAACCT CTTCTTTCGA GGAGTATTTT ACTAAAAAAC TAGCCTTGGA 4560 

GCGTTCCCCA GAAACGGACT TACTCATTGA CTCTTCAAAG ATTTGGGGAG AAGATTTTGC 4620 

TTCATCTGTT CCTTGAAAAA AGTCACAGCA GTCATCACAG ACGATAGTAC TGAACAAAAC 4680 

TATGAAGAGT TAGAAATTTA TACGCAGGTG ATTGTATAAA GGATCTGGAA ATAGATAAGA 4740 

AGTTGATTAG TATTGACCTA GGTGGTACAA ATATTAAGAT TACTGTTCTT TCAAATGACG 4800 

GTGAGATTGA AACTTTGTGG AGTATTACAA CAGATACAAG TGAGAAAGGT TCTCAAATTA 4860 

TATCGGACAT CATCAGTTCT ATTAAAAATA AATTGACCGA ACGGAATATT CCTGATAGCG 4920 

ACCTTCTTGG AATCGGTATG GGAAGTTGCT CATCATACTT TCCTTGTAAA TCATAGGGGC 4980 

TATAAACTCT CCGTCTACTT GTCCTGCAAC AATTGAAGTC TGCTCAAAAC GCCGTCCGCT 5040 

AATCTTTTCA TAGACTTTCT CCCTTTTAGG AGCCTAGCTT TCTAGTTTGT TCTTTGATTT 5100 

TTATTGAGTA TACCACTATT TTACTCCCTC TGGCAAGGGA CTTTGtCTAT GTGGAGGGAT 5160 

TGGGCTCCTA TGTGGTGGAG CTTTTCTGTT CTTTCTGAAA TATGGTATAA TAGCACTAAT 5220 

CAATTTCTAG GAAAATAGAT ACAGAAAGGG GCTGAAAGAT GTCTCATATT ATTGAATTGC 5280 

CAGAGATGCT GGCAAACCAA ATCGCGGCTG GAGAGGTCAT TGAACGTCCT GCCAGTGTGG 5340 

TCAAAGAGTT GGTAGAAAAT GCCATTGACG CGGGCTCTAG TCAGATTATC ATTGAGATTG 5400

AGGAAGCTGG TCTCAAGAAG GTTCAAATCA CGGATAACGG TCATGGAATT GCCCACGATG 5460

AGGTGGAGTT GGCCCTGCGT CGCCATGCGA CCAGTAAGAT AAAAAATCAA GCAGATCTCT 5520

TTCGGATTCG GACGCTTGGT TTTCGTGGTG AAGCCTTGCC TTCTATTGCG TCTGTTAGTG 5580

TCTTGACTCT GTTAACGGCG GTGGATGGTG CTAGTCATGG AACCAAGTTA GTCGCGCGTG 5640

GGGGTGAAGT TGAGGAAGTC ATCCCAGCGA CTAGTCCTGT GGGAACCAAG GTTTGTGTGG 5700

AGGATCTCTT TTTCAACACG CCTGCCCGTC TCAAGTATAT GAAGAGCCAG CAAGCGGAGT 5760

TGTCTCATAT CATTGATATT GTCAACCGTC TGGGCTTGGC CCATCCTGAG ATTTCTTTTA 5820

GCTTGATTAG TGATGGCAAG GAAATGACGC GGACAGCAGG GACTGGTCAA TTGCGCCAAG 5880

CAATCGCAGG GATTTACGGT TTGGTCAGTG CCAAGAAGAT GATTGAAATT GAGAACTCTG 5940

ACCTAGATTT CGAAATTTCA GGTTTTGTGT CCTTGCCTGA GTTGACTCGG GCTAACCGCA 6000

ATTATATCAG CCTCTTCATC AATGGCCGTT ATATTAAGAA CTTCCTGCTC AATCGTGCTA 6060

TTTTGGATGG TTTTGGAAGC AAGCTTATGG TTGGACGTTT TCCACTGGCT GTCATTCACA 6120

TCCATATCGA CCCTTATCTA GCGGATGTCA ATGTGCATCC AACTAAGCAA GAGGTGCGGA 6180

TTTCCAAGGA AAAAGAACTG ATGACTCTGG TTTCAGAAGC TATTGCAAAT AGTCTCAAGG 6240

AACAAACCTT GATTCCAGAT GCCTTGGAAA ATCTTGCCAA ATCGACCGTG CGCAATCGTG 6300

AGAAGGTGGA GCAAACTATT CTCCCACTCA AAGAAAATAC GCTCTACTAT GAGAAAACTG 6360

AGCCGTCAAG ACCTAGTCAA ACTGAAGTAG CTGATTATCA GGTAGAATTG ACTGATGAAG 6420

GGCAGGATTT GACCCTGTTT GCCAAGGAAA CCTTGGACCG ATTGACCAAG CCAGCAAAAC 6480

TGCATTTTGC AGAGAGAAAG CCTGCTAACT ACGACCAGCT AGACCATCCA GAGTTAGATC 6540

TTGCTAGCAT CGATAAGGCT TATGACAAAC TGGAGCGAGA AGAAGCATCC AGCTTCCCAG 6600

AGTTGGAGTT TTTCGGACAA ATGCACGGGA CTTATCTCTT TGCCCAAGGG CGAGATGGAC 6660

TTTACATCAT AGATCAGCAC GCTGCTCAGG AACGGGTCAA GTACGAGGAG TACCGTGAAA 6720

GCATTGGCAA TGTTGACCAA AGCCAGCAGC AACTCCTAGT GCCCTATATC TTTGAATTTC 6780

CTGCGGATGA TGCCCTGCGT CTCAAGGAAA GAATGCCTCT CTTAGAGGAA GTGGGCGTCT 6840

TTCTAGCAGA GTACGGAGAA AATCAATTTA TTCTACGTGA ACATCCTATT TGGATGGCAG 6900

AAGAAGAGAT TGAATCAGGC ATCTATGAGA TGTGCGACAT GCTCCTTTTG ACCAAGGAAG 6960

TTTCTATCAA GAAATACCGA GCAGAGCTGG CTATCATGAT GTCTTGCAAG CGATCTATCA 7020

AGGCCAATCA TCGTATTGAT GATCATTCAG CTAGACAACT CCTCTATCAG CTTTCTCAAT 7080

GTGACAATCC CTATAACTGT CCTCACGGAC GTCCTGTTTT GGTGCATTTT ACCAAGTCGG 7140

ATATGGAAAA GATGTTCCGA CGTATTCAGG AAAATCACAC CAGTCTCCGT GAGTTGGGGA 7200

AATATTAAAA GTATAAAAAA GTCTGGGAAA AATTTTCAAA ATCAAAAAAA CGCATAAAAT 7260 

CAGGTGTTCA AAAACCTTGA TTTTATGCGT TTTATCATGG AAATAGTTAC TTCATTTTTT 7320 

CCTAATTCTT TTCGAAACTC TTTTTAAACG ACGTCAGTTT TATCAGTAAT CTCAAAACAG 7380 

TGTTTTGAGC TAATTTTGCC AGTTTTGTCT GTAACATCGA AGTTGTGTTT TACCACTCTG 7440 

CGACTGGTTT CCTAGTTTGC TCTATGATTT TCACAGAGCA TTAAATTGCG ATTTTGCCAA 7500 

GTTTCTTTAT TCGTCTAAAA GTAGAGTCTG TTCTATGCGT CTAATGTACG AATCAGGTTG 7560 

ACCATTTCAA TAGCTCCTTG TGCACACTCA GAACCCTTAT TTCCTGCTTT AGTACCAGCT 7620 

CGTTCTATGG CTTGTTCAAT TGTATCTGTC GTTAGCACAC CAAACATAAC AGGAATTTCG 7680 

CTATTTAAAC TGATTTGGGC GATTCCCTTA GATACCTCGC TACATACATA ATCATAATGA 7740 

CTTGTATTCC CTCTAATGAC AGCTCCCAAG CAGATAATTG CATCATATTT TTTACTTTTT 7800 

GCCATTTTTG ATGCAATCAG TGGTATTTCA AAAGCTCCTG GAACCCAGGC TACCTCTATA 7860 

TCTTTCTCGT TTACATTCTC TCTTTTGAGA TTATCTAGTG CTCCAGATAA TAATTTTGAA 7920 

GTTATAAATT CATTAAATCT CGCTACAACA ATACCTATTT TAATATTGTT TGCTACTAAA 7980 

TTACCTTCAT AAGTGTTCAT TTATTTTTCC TCCATATTTA AAATGTGACC CATTCGATTT 8040 

TTCTTTGTTT CTAAATAAAA ACTATCGTAA GGATTGGCTT CTATTTCGAT TGATATTCTA 8100 

CTGGAAATGG TAATTCCATA TTTTTCTAAC TGTTCAACCT TGTCAGGATT ATTTGTCAGT 8160 

AAATGAAGTG ACTGAAGTCC CAGATCTTTA AGCATTTTTG CTCCAATATG ATATTCTCTT 8220 

AAATCACCTT CAAAGCCTAA TGCAAGATTG GCATCAAGCG TATCCATGCC TTGATCTTGT 8280 

AAATGATAGG CTTTTAATTT ATTGATAAGT CCAATTCCTC GTCCCTCCTG TCGCAAGTAA 8340 

AGTAAGACAC CCGAACCATT CTCAACAATC ATTTTCATAG CCTTATCGAA TTGCTGTCCA 8400 

CAATCGCAAC GTAAAGAGCC TAAAACATCT CCTGTTAAAC ATTCGGAGTG GACCCGACAT 8460 

AATACATTGG CTTCATCCTC TATATTTCCC ATAATAAGAG CAAGATGATG TTCCCCATTT 8520 

AGTTTATCTA TATAGCTAAT TGCTTTGAAA TTACCGTATC TAGTAGGCAT ATTGACAGTT 8580 

GAAACTCGTT CTACCAGCTG ATCATATACT TTTCTATATT CTTGTAATTC TTTGATGGTA 8640 

ATTAGTGGAA TGTTGTGTTT TTTCGAGAAC TGAATTAAAT CATCTGTTCT CATCATTTTG 8700 

CCATCATGAT TCATTATTTC ACAACATAGG CCACACTCTT TTAGTCCAGC TAATTTTAAT 8760 

AAATCAACAG TTGCTTCTGT GTGTCCATTT CTTTCTAGGA CACCACCTTT TTTTGCAATT 8820 

AAAGGAAACA TGTGTCCTGG CCTGCGAAAA TCAGAGGGTG TTATATCTTC AGCTACACAC 8880 

ATACGTGCGG TCAGTCCTCT TTCCTCGGCA GAAATACCTG TGGTCGTTTC TTTATAATCA 8940

ATTGAAACTG TAAAAGCAGT CTTATGATTA TCTGTATTGT TTTCAACCAT AGGTGAAAGC 9000

ATTAATTGAT TAGCTAAACT TTCGCTCATA GGCATACAAA TTAATCCTTT GGCATAAGTA 9060

GCCATAAAAT TAACATTTTC TGTTGTAGCT GCTTGTGCAG AACAAATTAA GTCTCCTTCA 9120

TTTTCTCTAT CCTTGTCGTC TATAACAAGA ACAAGTCGTC CCTTCTGCAA TGCTTCTAAT 9180

GCTTCTTGTA TTTTTCGATA TTCCATTGAC TGATTATCCT TTCTGCTAAA ATCCATTTTG 9240

ATATAATAGT TCCTTAGATA TTTCTGATTT TGGAGAGTTA TCCATCAGTT TTTGCACATA 9300

TTTACCTAAG ATATCATTTT CAAGATTTAC TGTACTCCCG ACTTGTTTAC TCTTAAGAAT 9360

GGTTTGTTCC AAGGTATGAG GGATAACAGA TACTGAAAAG TTTACTTTGG AGACTTTAGC 9420

GACAGTCAGA CTAATGCCGT CAATTGTAAT AGATCCTTTT TCAACTATTA AATCTAAAAT 9480

TTCTTTTTGT GTGTTGATTT GATACCATAC AGCATTATCA TCTTTTTTTA TTGACGAGAT 9540

TTTTCCTGTA CCATCAATGT GTCCTGTAAC GACGTGACCC CCAAGTCGAC CGTTGACAGA 9600

TAAGGCTCTT TCTAGATTCA CCTCACTTCC ATGTTTTAAT AGAGTAAGAG CTGTTCGACT 9660

CCATGTTTCA TTCATTACAT CAACTGTAAA GGATTGATGA TTGAAATGAG TAACTGTAAG 9720

ACAGATACCA TTTACTGCTA TACTATCGCC TAAATGGATA TCCGTTAATA TTTTTGAGGC 9780

TTTAATTGAT AGTTTACAAT TACGAGAGTC TTTCTGTATT CTTTCAACTT TTCCGATTTC 9840

TTCAATTATT CCTGTGAACA TGGATAAATC ACTTCACTTT CTATGAGATA GTCATTTCCT 9900

ATTTGAGAAA ATGCATAAGG TTTCAATCTA ATAGCGTCAT TTGGCAAAGA AATACCTTCA 9960

CCTCCGACAG GAAACTTGGC ACTACCTCCA AAAACTTTTG GTGCAATATA TATTTTCAGC 10020

TCATCAACAA TTTGTTGTTC CAAAGCACTC CAATTCATTA GACTGCCCCC TTCTAGAAGT 10080

AGGCTATCAA TCTGCATGTT TCCTAGATGT TGCATTAAAC TCGATAAGTC TATATGATTG 10140

CCTTTTTTCT TTATGGAAAG TATTTCACAG CCATGATTTT GATATAGCTT CATTTTATTT 10200

TTGTCTTCAG AGGAAGTGGC AATGTAAGTT TTAATATCAT TTGCTGTTTT TACGATTTTA 10260

GAGGTAAGAG GAGTTCGTAA ATGTGTATCG GATATGATAC GGATAGGATT TTTCCCTTCC 10320

TCCAATCTAC ATGTCAGCAA AGGATCGTCT TGAATAACAG TATTGACTCC CACCATAATT 10380

GCACTAACAT GGTGTCGTAA CTGATGCACA TGCTTTCTTG CTTCTTGTTC AGTAATCCAT 10440

TTGGATTGAT TTGTTTTAGT GGCTATTTTT CCATCCATTG ACATTGCATA TTTCATAAAA 10500

ACATAGGGTA CATGCTGGGT AATATACTTT CTAAAACTTT TTATTAAGTT AAGACACTCA 10560

TTTTCTAAAA TTCCAACAGT AACTTGAAGA TTATTTTCCT CAAGTATCTT TACTCCTTTT 10620

CCAGATACAA TAGGATTACA GTCTAGGCTT CCAATGACTA CTCTTGTAAT ACCACTATCG 10680

ATTATAGCAT CTATACAGGG AGGTGTTTTC CCGAAGTGAC AACAGGGTTC AAGTGTTACA 10740

TAAAGCGTCG CTCCGACAGG GGATTCTCTA CAGTTTTTAA GAGCATTTCT CTCAGCATGT 10800

GGGCCACCAA AAAACTCATG ATAACCTTGT CCGATAATGT GATTATCTTT TACAATAACT 10860

GCGCCGACCA TAGGATTGGG ATTGACGTAA CCAGCCCCTT TTTGTGCCAG TTTTATTGCT 10920

AATTTCATAT ATTTTGAATC GCTCATCTCG CTACCTCCAA AAAAATATAC CTTGAATAGG 10980

GGACTACTCA AGGCATACAA AAGAAAACTT ATGCGATTAA CAAAAATGCT CTGAAATGAC 11040

AAGTAATCAT TTCAGAGCAC GCAAAAAGCA CAAATATACT TTTATCTTCT TTCATCCAGA 11100

CTATACTGTC GGCTTTGGAA TTTCACCAAA TCATGCCTTT CGGCTCGTGG GCTATACCAC 11160 

CGGTAGGGAA TTTCACCCTG CCCTGAAGAT AGTTATTCAA TTACAGATGA TTATAGTACT 11220 

TAATTTTGAA TATGTCAACA GATAAATACC GATTGTTTTT GATATACTGT ATTTGTGATA 11280 

ATCGATTCTC GCTCCTCGGA TAAAGAAAAT ATGATATACT AGATAAACGA AATAAGAGAG 11340 

AAGGAATACT ATGTACGCAT ATTTAAAAGG AATCATTACC AAAATTACTG CCAAATACAT 11400 

TGTTCTTGAA ACCAATGGTA TTGGTTATAT CCTGCATGTG GCCAATCCTT ATGCCTATTC 11460 

AGGTCAGGTT AATCAGGAGG CTCAGATTTA TGTGCATCAG GTTGTGCGTG AGGACGCCCA 11520 

TTTGCTTTAT GGATTTCGCT CAGAGGATGA GAAAAAGCTC TTTCTTAGTC TGATTTCGGT 11580 

CTCTGGGATT GGTCCTGTAT CAGCTCTTGC TATTATCGCT GCTGATGACA ATGCTGGCTT 11640 

GGTTCAAGCC ATTGAAACCA AGAACATCAC CTACTTGACC AAGTTCCCTA AAATTGGCAA 11700 

GAAAACAGCC CAGCAGATGG TGCTGGACTT GGAAGGCAAG GTAGTAGTTG CAGGAGATGA 11760 

CCTTCCTGCC AAGGTCGCAG TGCAAGCAAG TGCTGAAAAC CAAGAATTGG AAGAAGCTAT 11820 

GGAAGCCATG TTGGCTCTGG GCTACAAGGC AACAGAGCTC AAGAAAATCA AGAAATTCTT 11880 

TGAAGGAACG ACAGATACAG CTGAGAACTA TATCAAGTCG GCCCTTAAAA TGTTGGTCAA 11940 

ATAGGAGCAG AGAATGACAA AACGTTGTTC GTGGGTCAAG ATGACCAACC CGCTCTACAT 12000

CGCCTATCAT GATGAGGAGT GGGGCCAGCC CCTCCATGAT GACCAAGTAT TGTTTGAG1T 12060 

GTTGTGTATG GAAACCTATC AGGCAGGCCT GTCTTGGGAA ACGGTACTCA ACAAACGCCA 12120 

AGCTTTCCGA GAAGTCTTTC ATAGCTATCA AATTCACTCA GTCGCAGAGA TGACTGACAC 12180 

TGAATTGGAA GCCATGCTGG AGAATCCAGC TATCATTCGA AATAGAGCCA AGCTTTTTGC 12240 

TACACGCGCT AACGCCCAAG CCTTTCTACA GTTACAGGCA GAGTACGGCT CTTTTGATGC 12300 

CTATCTTTGG TCTTTTGTTG AGGGGAAAAC TGTCGTTAAC GATGTTCCTG ATTATCGCCA 12360 

AGCGCCAGCT AAAACACCCT TATCTGAGAA ATTAGCCAAA GATCTCAAAA AACGAGGCTT 12420

CAAGTTCACA GGCCCAGTCG CCGTATTGTC TTTTCTACAG GCTGCAGGGC TAGTTGATGA 12480

CCACGAGAAT GATTGTGAGT GGAAAGGTCT TAAATGATGT CTAACAAAAA TAAGGAAATT 12540

CTGATTTTTG CGATTCTCTA TACAGTCCTC TTTATGTTTG ATGGCGTTAA ATTGCTGGCT 12600

TCTTTAATGC CATCTGCCAT TGCAAATTAT CTTGTTTATG TAGTTTTAGC TCTATATGGC 12660

TCCTTCTTGT TCAAGGATAG ATTGATCCAA CAATGGAAGG AGATTAGAAA GACTAAAAGA 12720

AAATTCTTCT TTGGAGTCTT AACAGGATGG CTCTTTCTCA TTCTGATGAC TGTTGTCTTT 12780

GAATTTGTAT CAGAGATGTT GAAGCAGTTT GTGGGACTAG ATGGACAAGG TCTAAATCAG 12840

TCTAATATTC AAAGTACCTT TCAAGAACAA CCACTACTGA TAGCTGTTTT TGCTTGTGTC 12900

ATTGGACCTC TGGTAGAAGA ATTATTTTTC CGTCAGGTCT TATTGCATTA CTTGCAGGAA 12960

CGGTTGTCAG GTTTACTAAG CATTATTCTG GTAGGACTTG TTTTTGGTCT GACTCATATG 13020

CACAGTTTGG CTCTATCAGA GTGGATTGGT GCAGTTGGTT ACTTAGGTGG AGGCCTTGCC 13080

TTTTCTATTA TTTATGTGAA AGAAAAAGAG AATATCTACT ATCCCCTACT TGTTCACATG 13140

TTAAGCAACA GCCTCTCCTT AATCATTTTA GCTATCAGTA TAGTAAAATG AAATGAGAAC 13200

AGGACAAATC GATTTCTAAC AATGTTTTAG AAGTAGAGGT GTACTATTCT AGTTTCAATA 13260

TACTGTAATA TGTGATGAAA ATGCCAGTAA TGATACCGAG AAAAAAGCTG AGAAACTTTT 13320

CCCAGCTTTA TTTGTTATAG TCAAAGAGAA TGACTTGTTC CTGTGCATCT ACATGAGCAT 13380

GGACCCCAAA GGGTACAATT GCTCTTGGAG TTGCGTGGCC GACATTCAGA TTATAGACAA 13440

TCGGGATATT GCTGTCAATG ATATCCAATA GTGCCTCTTT ATAGTCGTCA TGGAAAGTTT 13500

CATCCATAGG TTTTCCGACC AAGAGTCCAT TGATGACCGC GAATATGCCA GTGTCCTTTA 13560

AAGTTAGCAA CATCTTTTTG AAGTCTTCTG GCTTAGGCTT TTCTTCGCTT GTTTCGAGCA 13620

AGAGGATTTT CCCTTCCCAG TCTGACAAGT CAGGGAAAAG TTTGTATTTT TGGCAGAGTT 13680

CCGTGCTATC TGCGTATCGA GAGTTGTCAA AGATATCGTA GAGGGATTCG AGGCAACCAC 13740

CGAGGATTTT CCCCTCGAAC TGGGCACTTC CTTGCAACAA GTCAAAACGT GTATTTGTAT 13800

GACTGACACG AGGTGTTCCC AGGGCCGTGG GACTAAAATC AGTTCGTTCC TCATACCAAA 13860

CGTCACTAGG GCGGATTTCT GAAATTCTTC CCGTCTCAAT CAATTCTTTA AAGTAGTGAA 13920

GGCTATAGGC TAGCATTTCT TTGTCTAATT CACAAATGTC TGCTAAAAAG GATTGACCAT 13980

AAAAAGTCTT GATTCCTAAT TTATGCAACA TGAGGTGGTT CATGGTTGTA TCCGAGAAGC 14040

CAAGAAAAAT TTTTTGCTTG ATAACCTTTT GGAGTTGGTC ATTTTCAAAA AGATAAGGTA 14100

GCAAGCGATA GGTATCGTCT CCACCGATGG CACATAGGAT CATGTCGATG CTATCATCAG 14160

AAAAGGCATG AATCAAATCC TCTGCACGAG CTTCAGGATG GTCCTTGATA AAGTCTAATC 14220

CTTTTAACGA ATGGGGCAAA AAGATGGGAT TGGTCCCAGA TCCTTGAGAC GTT 14273

(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9828 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
GTGAAGTGCG GCAAAAGGTG CAAGTGATGA GCTCAGGTTC TTTAGCTCTT GACATTGCCC 60

TTGGCTCAGG TGGTTATCCT AAGGGACGTA TCATCGAAAT CTATGGCCCA GAGTCATCTG 120

GTAAGACAAC GGTTGCCCTT CATGCAGTTG CACAAGCGCA AAAAGAAGGT GGGATTGCTG 180

CCTTTATCGA TGCGGAACAT GCCCTTGATC CAGCTTATGC TGCGGCCCTT GGTGTCAATA 240

TTGACGAATT GCTCTTGTCT CAACCAGACT CAGGAGAGCA AGGTCTTGAG ATTGCGGGAA 300

AATTGATTGA CTCAGGTGCA GTTGATCTTG TCGTAGTCGA CTCAGTTGCT GCCCTTGTTC 360

CTCGTGCGGA AATTGATGGA GATATCGGAG ATAGCCATGT TGGTTTGCAG GCTCGTATGA 420

TGAGCCAGGC CATGCGTAAA CTTGGCGCCT CTATCAATAA AACCAAAACA ATTGCCATTT 480

TTATCAACCA ATTGCGTGAA AAAGTTGGAG TGATGTTTGG AAATCCAGAA ACAACACCGG 540

GCGGACGTGC TTTGAAATTC TATGCTTCAG TCCGCTTGGA TGTTCGTGGT AATACACAAA 600

TTAAGGGAAC TGGTGACCAA AAAGAAACCA ATGTCGGTAA AGAAACTAAG ATTAAGGTTG 660

TAAAAAATAA GGTAGCTCCA CCGTTTAAGG AAGCCGTAGT TGAAATTATG TACGGAGAAG 720

GAATTTCTAA GACTGGTGAG CTTTTGAAGA TTGCAAGCGA TTTGGATATT ATCAAAAAAG 780

CAGGGGCTTG GTATTCTTAC AAAGATGAAA AAATTGGGCA AGGTTCTGAG AATGCTAAGA 840

AATACTTGGC AGAGCACCCA GAAATCTTTG ATGAAATTGA TAAGCAAGTC CGTTCTAAAT 900

TTGGCTTGAT TGATGGAGAA GAAGTTTCAG AACAAGATAC TGAAAACAAA AAAGATGAGC 960

CAAAGAAAGA AGAAGCAGTG AATGAAGAAG TTCCGCTTGA CTTAGGCGAT GAACTTGAAA 1020

TCGAAATTGA AGAATAAGCT GTTAAAGCAG TGGAGAAATC CGCTACTTTT TCGATTTTTG 1080

ATTCAAGTTT TTAGATTATA TATAGTAGCT TGAAATAAGA TATGAACAAC TCTATTAGGA 1140

AAGTCAAATT AATTTCTAGA AATGTTTTAG CAGCTACAGC GTACTATTCC AAACTCAACC 1200

AACTATAATA GATCGAAACT AGAATAGTAC ATATCTACTT CTAAAACATT GTTAAAAATC 1260

GATTTGACTT TCCTTATTTC ATTCCGCTAT ATATAGTTTG CTGTTTCTTG TCGCTCCTCT 1320

GGAAAGCTGA TATAATAGCT TTATGAATAA AAAACGAACA GTGGACCTGA TACATGGTCC 1380

GATTCTTCCC TCGCTCTTAA GCTTCACCTT TCCAATTTTG CTATCAAATA TTTTTCAACA 1440

GCTCTATAAC ACTGCTGATG TCTTGATTGT TGGACGATTT CTTGGTCAAG AATCCTTGGC 1500 

TGCAGTAGGA GCGACGACAG CGATTTTTGA CCTGATTGTA GGTTTTACAC TTGGTGTTGG 1560 

CAATGGCATG GGGATTGTCA TTGCTCGTTA TTATGGGGCT CGGAATTTCA CTAAAATCAA 1620 

GGAAGCAGTA GCAGCCACCT GGATTTTAGG TGCTCTTTTG AGCATTCTAG TTATGTTGCT 1680 

GGGCTTTCTT GGCTTGTATC CTCTCTTGCA ATACTTAGAT ACTCCTGCAG AAATTCTTCC 1740 

TCAATCTTAT CAATATATTT CTATGATTGT GACCTGTGTA GGTGTCAGCT TTGCTTATAA 1800 

TCTTTTTGCA GGCTTGTTGC GGTCTATTGG TGACAGTCTA GCAGCCCTGG GATTTCTGAT 1860 

TTTCTCTGCC TTGGTTAATG TGGTTCTGGA TCTCTATTTT ATTACGCAAT TGCATCTGGG 1920 

AGTTCAATCC GCAGGACTTG CTACCATTAT TTCGCAAGGT TTATCAGCGG TTCTCTGCTT 1980 

TTATTATATT CGTAAAAGTG TGCCAGAACT CTTGCCACAG TTTAAACATT TCAAATGGGA 2040 

CAAAAGCTTG TACGCGGATC TCTTGGAGCA AGGTTTGGCT ATGGGCTTGA TGAGTTCAAT 2100 

TGTATCTATC GGCAGTGTGA TTTTACAGTT TTCTGTTAAT ACATTTGGTG CAGTGATTAT 2160 

TAGTGCCCAG ACGGCAGCTC GACGCATTAT GACCTTTGCC CTTCTTCCTA TGACCGCTAT 2220 

TTCTGCATCA ATGACGACCT TTGCTTCTCA GAATCTAGGA GCTAAGCGAC CTGACCGTAT 2280 

TGTTCAAGGT CTTCGAATCG GCAGTCGTTT AAGTATATCC TGGGCAGTTT TTGTTTGTAT 2340 

TTTCCTCTTT TTTGCCAGTC CAGCTTTGGT TTCCTTCTTG GCTAGTTCGA CAGATGGTTA 2400 

CTTGATAGAA AATGGAAGTC TCTATCTGCA AATCAGTTCA ACCTTTTATC CCATTTTGAG 2460 

CCTCTTGTTG ATTTATCGCA ATTGCTTGCA GGGCTTGGGG CAAAAGATCC TTCCTCTAGT 2520 

TTCTAGCTTT ATTGAACTAA TCGGAAAAAT CGTTTTTGTG GTTTTGATTA TTCCTTGGGC 2580 

AGGATATAAG GGTGTTATCC TTTGTGAACC TCTTATCTGG GTTGCCATGA CAGTTCAACT 2640 

GTACTTCTCA TTATTCCGTC ATCCCTTGAT AAAAGAAGGC AAGGCAATCT TGGCAACCAA 2700 

AGTGCAATCC TAGTTGGATT TACTGAATAA AATCCATTTC CTCTAGTGAA AATCGAAAAA 2760 

ACTTGTGTTC TCTTCTTTAG TTTGGTGTTG AAAATAGTTT AACAGACTTT TGACTTCTTT 2820 

TATATGATAT AATAAAGTAT AGTATTTATG AAAAGGACAT ATAGAGACTG TAAAAATATA 2880 

CTTTTGAAAA TCTTTTTAGT CTGGGGTGTT ATTGTAGATA GAATGCAGAC CTTGTCAGTC 2940 

CTATTTACAG TGTCAAAATA GTGCGTTTTG AAGTTCTATC TACAAGCCTA ATCGTGACTA 3000 

AGATTGTCTT CTTTGTAAGG TAGAAATAAA GGAGTTTCTG GTTCTGGATT GTAAAAAATG 3060 

AGTTGTTTTA ATTGATAAGG AGTAGAATAT GGAAATTAAT GTGAGTAAAT TAAGAACAGA 3120 
TTTGCCTCAA GTCGGCGTGC AACCATATAG GCAAGTACAC GCACACTCAA CTGGGAATCC 3180

GCATTCAACC GTACAGAATG AAGCGGATTA TCACTGGCGG AAAGACCCAG AATTAGGTTT 3240 

TTTCTCGCAC ATTGTTGGGA ACGGTTGCAT CATGCAGGTA GGACCTGTTG ATAATGGTGC 3300 

CTGGGACGTT GGGGGCGGTT GGAATGCTGA GACCTATGCA GCGGTTGAAC TGATTGAAAG 3360 

CCATTCAACC AAAGAAGAGT TCATGACGGA CTACCGCCTT TATATCGAAC TCTTACGCAA 3420 

TCTAGCAGAT GAAGCAGGTT TGCCGAAAAC GCTTGATACA GGGAGTTTAG CTGGAATTAA 3480 

AACGCACGAG TATTGCACGA ATAACCAACC AAACAACCAC TCAGACCACG TTGACCCTTA 3540 

TCCATATCTT GCTAAATGGG GCATTAGCCG TGAGCAGTTT AAGCATGATA TTGAGAACGG 3600 

CTTGACGATT GAAACAGGCT GGCAGAAGAA TGACACTGGC TACTGGTACG TACATTCAGA 3660 

CGGCTCTTAT CCAAAAGACA AGTTTGAGAA AATCAATGGC ACTTGGTACT ACTTTGACAG 3720 

TTCAGGCTAT ATGCTTGCAG ACCGCTGGAG GAAGCACACA GACGGCAACT GGTACTGGTT 3780 

CGACAACTCA GGCGAAATGG CTACAGGCTG GAAGAAAATC GCTGATAAGT GGTACTATTT 3840 

CAACGAAGAA GGTGCCATGA AGACAGGCTG GGTCAAGTAC AAGGACACTT GGTACTACTT 3900 

AGACGCTAAA GAAGGCGCCA TGGTATCAAA TGCCTTTATC CAGTCAGCGG ACGGAACAGG 3960 

CTGGTACTAC CTCAAACCAG ACGGAACACT GGCAGACAAG CCAGAATTCA CAGTAGAGCC 4020 

AGATGGCTTG ATTACAGTAA AATAATAATG GAATGTCTTT CAAATCAGAA CAGCGCATAT 4080 

TATTAGGTCT TGAAAAAGCT TAATAGTATG CGTTTTCTTG TGGAGATATT TCCTTCAATT 4140 

TTGCTACTAT ATTAAACAAA AATCAAAAAG CAAACTAGAA AGTTATGCTC AAATAAAATC 4200 

TAAATTTGAC AATGTAAACC GAGTCGGATA GCTTTAAGTA CTGTTTTGAG GTTGAAGATA 4260 

CGATTTTTGA TAGGAACTCA TCAATTTTAG ATTTTTAAGC AGCATCAATA AATTGCTTCC 4320 

TTGTTTTGTC ATAATTTTTT TATTTAAAAA ATTATGACma GAGTGTGCTA TTCTTTTTAT 4380 

GAGAGGTGTA TGAATATGAT AAATGTATGT GATAAATGTA TGTGATGTTG GAAAAAGAAT 4440 

AAAAGAACTT AGAATATCTT CAAATCTTAC TCAAGATAAG ATTGCTGAGT ATTTGTCTTT 4500 

GAATCAAAGC ATGATTGCCA AAATGGAAAA AGGTGAAAGG AATATCACGA ATGGATTTAA 4560 

GTAATAAAGC TTCAAATCTT AGAAAAAAGT TGGGAGCTGA TGGTGAATCG CCGATAGATA 4620 

TTTTTAAATT GGTACAAAAG ATAGAAAATT TGACGCTGGT ATTTTATGGA CTCGGAAAGA 4680 

ATATTAGCGG AGTCTGTTAT AAAGGAACTC AGTTCAGTCT CATTGCAGTC AATTCAGACA 4740 

TGCCATTAGG AAGGTAAAGA TTTTCTTTAG CACATGGACT GTATCATCTT TATTATGATG 4800 

AGGTGAAGAA GAGTTCAGTC AGTCTTATCT TGATTGGTGA AGGAGATGAA ACTGAAAGAA 4860 

AAGCGGATCA GTTTGCTTCT TATTTTTTAA TTTTCCCATC TTCACTGTAT AGGATGGTTG 4920

AGGAAATCAG AGAAAATGCC AATAGAACTC ATCTTGAAGT AGAAGATATT ATAAAATTGG 4980

GTCAGTTTTA TGGTATCAGT CATAAAGCTA TGTTATATAG ATTGAGGAAT GATGGATACC 5040

TTGATGCAGA AGAAATTAAA AATATGGATA TTAGTGTTAT AGAGACAGCT TCAAGATTAG 5100

GCTATGATAC AAGTTTATAT CGTCCTTTGT CAGAAAGTAA AAAAGAAATG GCATTAGGAT 5160

AATATATTAA TTCAACTGAA CAACTTTTAG AAAATAACAG AATTTCGCAA GGGAAGTATG 5220

AGGAACTGTT ACTAGATGCT TTCAGATATG ATATTGTATA TGGGGTAGAT GAAGAGGGGG 5280

GAGTTGTCGT TTGACTAGTC GTGTATTTAT TGATGCAGAT TGTATTTCAG TATTTTTATG 5340

GGTTGGCACT GAACATCTTT TAGAAAAGCT CTATTTGGGT AAAATTGTTA TTCCACAAGA 5400

GGTGTATGAT GAAATCAATA TACCTACAAT TCCCCATTTA AAATCTAGGA TAGATCAGTT 5460

GGTAGCTAAG GGTTCAGCTG AGATTGTGAG CATAGACATT GGAACTGAAG AATACGCATT 5520

ATATAGAGAT TTAACAAGAA ATCATGATAG TAACAAGATT ATTGGTAAGG GAGAAGGGGC 5580

ATCTATTTCC TTAGCGAAAA AGCATAATGG GATATTAGGA AGTAATAACC TAAGAGATGT 5640

TAAATCATAT GTAGAAGAAT TTTCTTTAGA ATATATGACA ACAGGAGATA TACTGATTGA 5700

AGCGTTTAAA GCGTAATTTA TTACTGAATA AGAGGGCAAT CATATCTGGA ATAATATGCT 5760

TAAAAAGAGA AGGAAAATTG GTGCAAATTC ATTTTCAGAC TATCTTCGTG GAAGTATTCA 5820

TCAAAATAGA CAAAAATAAA TTTGGATAAA TCGAACTCAC TATTCAGGAG GCATATGAGC 5880

AATTCGAAAA AGAAAAGTGT CAAATTGAGC CTATAGGAGT AGAAGTGAAA TAGTAAGTCC 5940

TGCATAGTGG ATGAGAGAAA AGTTCTCCTT GAAGTTTTCC TGAACTATCA GTCGCATGTC 6000

AAACGATATG TAGGGTAATG TGAGAGGGGA TAGCGAGTAG TTTTTGGTTA TTTTATCAAA 6060

AAACTTATAT TTTATTATAC CGAATGATAA AATATAATAA AAATGATAGA ATAAGGAAAA 6120

AACATGAATG TCAAAAAGAT AATGTCAATT TTTCAATCCT TTTATGTTGA TGTCAGTATT 6180

GAGGAACTGA CTTTGACTTT ACCAATCAGT TTTGTAAAAA GGTTTGAGTA TACTCAAATG 6240

ACTTTTCATA AGGAATCATT TTTATTGATT AAAGAAAAGA GAAGGGGGAG TTTGAGTTCA 6300

TTTGTTACTC AGGCTCGCAC TATGGGTGAA AAAGCCAATA TGGATGTTGT TTTGGTGTTT 6360

TCGAAGTTAT CAGACAGTGA AAAAAAGCAA TTACTTCAAG CTAGAGTTCC GTTTGTAGAC 6420

TTTAAGGGAA ACCTCTTCTT CCCTCCATTG GGACTAGTAC TCAATGCGAA TGATACTGAA 6480

GTCCCTAAGG AATTAACACC TAGCGAACAA TTAACGTGGA TTGCCTTTTT ATTGACAAAA 6540

GGTCAAAAAG TAGTAGATGT TGATTTGCTT TGACAAGTCA CTGGACTTCG AAACTGAACA 6600

ATTTATAGGT GTTTGAGGAC TTTTAAAGCT TTATATTGGT TAAACAAGCA AAATAAGCTT 6660

TACACATATA CGGTGTCAAA GAAAGAATTA TTCTTAAAAT CCGTGTCATG TTTATTTAAT 6720

CCCATCAAAA AACGGATTTT ATTGCCAGAT GGCGATATAA AGCAGATAAA ATCTGTTTCT 6780 

AACCTTCTAT ATGGTGGTGC TTATGCTTTG TCGCATTCAA CTTTTTTAGC TGAAACGGAT 6840 

GAAAATATTA GCTATGTCAT ATGGCAGAGA AAATTCAATC AGTTATCCTT GCCACTTTCT 6900 

CAGCATGTTT TAAAATGAAA GATGCTAGAG ATATGGAAAT ATCGTCCTTT TGTATCTGAG 6960 

TTTTGGAATG ATTTTAAAAA TAATCATGAT AAACAATTTG TAGATCCGAT TTCTCTTTAT 7020 

TTGACCTTAA AAGATGATGA TGACCCACGT ATAGAGGAAG AGAGTGAAGC ACTAGAAAAT 7080 

ATGATATTAC AGTATCTGGG AGAAGATGAT GCCAGCTAAT ACGAAAGTTA TTTTTCAAGA 7140 

AATGTTTGCG GATTTTCAGA ACTATTATGT TCTGATTGGG GGAACTGCTA CCTCTATCGT 7200 

ATTGGATTCG CAAGGATTTA AAAGTCGCAC AACAAAAGAT TATGATATGG TCATCATTGA 7260 

TGAAGTAAAA AATAAGGAAT TTTATACTAC CTTGAATCAT TTTTTAGAAT TGGGAGAGTA 7320 

TCAAGGAAGT CAGAAAGATG AGAAAGCGCA GCTTTTTCGA TTTACAACAA CTAATCCTGA 7380 

GTTTCCTTCT ATGATTGAAC TATTTAGTAT CTTACCAGAA TATCCATTAA AGAAGGACGG 7440 

TCGAGAAATT CCCTTACATT TTGACCAAGA TGCTAGTTTA 1*CAGCCTTAT TATTGGATGA 7500 

AGATTATTAT AATATATTGG TGCATGAAAA AGAAACCATT CAGGGGTATT CGGTATTGAG 7560 

TAATTGTGGT TTATACTCTT CGAAAATCTC TTCAAACCAC GTCAGCTTCC ATCTACAACC 7620 

TCAAAACAGT GTTTTGAGCA GCCTGCAGCT AGCTTCCTAG TTTGCTCTTT GATTTTCATT 7680 

GAGTATTAAT TATTTTTAAG GCTAAAGCTT GGCTGGATAT GAGGGAGCGC TCTGCCACAG 7740 

GTGCTCAAGG TTTAAGTAAG TCCATTAAAA AGCATTTGAA TGACCTTACC CGTTTGACAG 7800 

CTTCCTTGCT AGGAGATGAA AAGTTATCGG CTATAACATC AAGTAGTGCG GTAAAAGCAG 7860 

ACATGCACCG CTTTGTGATA GAATTAGAGC CTGTGAAGTC AACTATTCTT CAAAATAATG 7920 

ACATTTCATT GGATCAAAAT GAAATTTTTG AAATTCTGAA AAATTTTCTC GATGGTTAAA 7980 

ATAATTGTAG CGAGATGGCT ATATTGAATT CGTCTATATC TGGAAACTAG AAAAAACTTC 8040 

AATTTCAGGA GAAAATGAAG TCAATCTTCC CACAATCAAA CGTATAGTAT CAAGGTTTTT 8100 

CAAGACCTGA TATTATGCGT TTTTTGCTTT TCAAAACTTT TTGCCCAGTC TTCGTTTTTA 8160 

TCCTCTAGTC ACTTGATTTG TTTCAGGTGG TTTTTTAGTA TAGTAGAATG AAACGAGAAC 8220 

AGGACAAATT GATCAGGACA GTCAAATCGA TTTCTAACAA TGTTTTAGAA GCAGAAGTGT 8280 

ACTATTCTAG TTTCAATCTA CTATAGTTAA ATCTGCGGTC AAGTCTACTG GTGAATCTAT 8340 

GATTGTAATA CTCTTCCAAA ATCTCATCAA CCACGTCAGT CTTGCCTTGC AGTCTGTATC 8400 

TTACTGACCA AGCTAGTGAT GGATTTAGAA TAGGTGATTT GGAGCGTCCT ATTAGCTAGG 8460

AAATGCTGCT CATAGTCCTT TGCTGAGGCT AGGGTGTTTC AACATTCAAC ACTCAACTGG 8520

TTGATCTAGT TGATAGGAAG GGAGTTACTA TAAAATACTC AGGCTTCCAT CATATTTTTT 8580

GAAACGATTG TGTAATCAAA ATGTACCAAT ATTGTAGTAT TGGTACAGAA GATGTTGTGA 8640

ATGGATAAAT ATATCATAAC TGCTATCTCA AAAAGATTTC ATATGTCTGT GCATATATAA 8700

TAGACTTCCT GCAAAACTAG AATCCTAGTT CATGATTGAT AATACCAGCA ATCAAATTCA 8760

TTCGTAATCC AAAGCGTTTA CGATGATTTC GATAGGTTGT TGAAAACATT TTAAACGTTT 8820

CTACTTTGGC AAAGATGTTC TCAACCTTGC TTCTCTCCTT AGATAGCGCA TGGTTATAGG 8880

CTTTATCTTC AGCTGTTAGC GGCTTGAGTT TGCTGGATTT ACGTGGAGTT TGTGCTTGAG 8940

GACATATCTT CATGAGCCCT TGATAACCAC TGTCAGCCAA GATTTTACCA GCTTGTCCGA 9000

TATTTCTGCA ACTCATTTTG AACAACTTCA TATCATGACA ATAGTTCACA GTGATATCCA 9060

AAGAAACAAT TCTCCCTTGA CTTGTGACAA TCGCTTGAGC CTTCATAGCG TGAAATTTCT 9120

TTTTACCAGA ATCATTCGCT AATTCTTTTT TTAGGGCGAT TGATTTTTAC TTCCGTCGCA 9180

TCAATCATTA CCGTGTCCTC AGAACTAAGA GGAGTTCTTG AAATCGTAAC ACCACTTTGA 9240

ACAAGAGTTA CTTCAACCCA TTGGCTCCGA CGGATTAAGT TGCTTTCGTG AATACCAAAA 9300

TCAGCCGCAA TTTCTTCATA AGTGCGGTAT TCTAGGCTTA ATTTAGGTTT TCGTCCACCT 9360

TTTGCGTGTT TAAGTTGATA AGCTGTTTTT AATACAGCTA ACATCTCTTT AAAAGTCGTG 9420

CGCTGAACAC CAACAAGACG CTTAAATCGT GTATCAGTTA ATTGTTTACT TGCTTCATAA 9480

TTTCGCAGGG AGTCTATTGA CTCTTTGGTA GGTGTCAATG TTTTTTTCAT CTATCCCGAG 9540

AATTATTTTC CCGCCATTTG TATTTGCAAA TGCTGAGTAG GTTTCCCAGA AAGACTCTGG 9600

AAGATTGTTT TTAGCTTTTT TGTATTCTAA ATCAACCCCT TCAAATTTTA AGTCCATATT 9660

TTTCCTTTAC ATCTGTTTTT TGTGGTTCTG GTATTTGTTC AAGTTGAGTG ATAATATAGC 9720

GAATTGAATT TCGAGAGTTT TTACTCAGTT AATTTCTTTT TTAACCCACT TTAATTGCTT 9780

TTTTAACACG GGTTAAAAAA GAAATTAAAG TGGGTTAATT TTTCTTGA 9828


(2) INFORMATION FOR SEQ ID NO: 42: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3369 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

CCGCGAAAGA TATTTTTGAA CAAGAGTTTG GACGTGAGGT CCGTGGCTAT AATAAAGTAG 60

AAGTTGACGA GTTTTTAGAC GATGTCATCA AGGACTATGA AACCTATGCT GCCTTGGTCA 120

AGTCACTTCG TCAGGAAATT GCGGATTTGA AGGAAGAATT AACTCGTAAA CCGAAACCTT 180

CACCAGTTCA AGCAGAACCC CTTGAAGCGG CAATTACAAG TTCTATGACG AATTTTGATA 240

TTTTGAAACG CCTGAATAGA TTGGAAAAAG AAGTTTTTGG TAAACAAATT TTAGATAACT 300

CAGATTTTTA AGTAGTTATT TGAGATGTGC AATTTTTGGA TAATCGCGTG AGGAGAATTG 360

TTTCTCATGA GGAAAGTCCA TGCTAGCACA GGCTGTGATG CCTGTAGTGT TTGTGCTAGG 420

CGAAACCATA AGCCTAGGGA CGAGAAATCG TTACGGCAGT TGAAATGGCT AAGTCCTTGG 480

ATAGGCCAGA GTAGGCTTGA AAGTGCCACA GTGACGGAGT CTTTCTGGAA ACAGAGAGAG 540

TGGAACGCGG TAAACCCCTC AAGCTAGCAA CCCAAATTTT GGTCGGGGCA TGGAGTACGC 600

GGAAACGAAC GTAGTATTCT GACTGCTATC AGCTAGAGCT GTTAGTGGTA GACAGATGAT 660

TATCGAAGGA AGTGGTCCTA GTCACTTCTG GAACAAAACA TGGCTTATAG AAAATTGCAT 720

ATAGGTTGGG GCTGAGAAAT TTTCTCAACC TCATTTTTTA AAGTGGACAT ATAGAAAGGT 780

CTTGCAAGAC TGTAACATGA AAAAAGAATT TAATTTAATT GCAACTGTGG CAGCAGGGCT 840

TGAGGCTGTC GTTGGTCGTG AAGTGCGAGA GTTGGGCTAC GATTGTCAGG TTGAAAATGG 900

ACGTGTTCGT TTTCAAGGAG ACGTGAGAGC TATTATCGAA ACCAACCTTT GGCTTCGGGC 960

AGCAGATCGT ATCAAAATTA TCGTAGGAAC GTTCCCAGCT AAGACTTTTG AAGAGCTATT 1020

TCAGGGAGTT TTCGCTTTGG ATTGGGAAAA TTATTTACGA CTTGGAGCTC GGTTCCCGAT 1080

TTCAAAAGCT AAATGTGTTA AGTCCAAACT TCACAATGAG CCCAGTGTTC AGGCTATTTC 1140

TAAGAAAGCT GTTGTCAAGA AATTGCAGAA ACACTATGCT CGCCCAGAAG GGGTTCCTCT 1200

GATGGAGAAT GGCCCAGAGT TTAAGATTGA GGTCTCTATT CTCAAAGATG TGGCAACTGT 1260

CATGATTGAT ACGACCGGGT CTAGCCTCTT TAAACGTGGT TATCGTACCG AAAAAGGTGG 1320

CGCTCGTATC AAGGAAAATA TGGCAGCAGC CATTTTACAA CTTTCTAACT GGTATCCAGA 1380

CAAGCCTTTG ATTGATCCGA CCTGTGGTTC GGGGACTTTC TGTATTGAGG CAGTTATGAT 1440

TGCTAGAAAG ATGGCGCCAG GTCTTCGTCG CTCTTTTGCA TTTGAGGAAT GGAACTGGAT 1500

CAGCGATCGC TTGATTCAAG AAGTGCGCAC AGAAGCGGCT AAAAAAGTAG ACCGTGAGCT 1560

TGAGCTGGAT ATCATGGGCT GTGATATTGA TGCTCGCATG GTGGAAATTG CTAAGGCCAA 1620

TGCTCAGGTA GCTGGTGTTG CAGGAGACAT TACTTTTAAG CAGATGCGCG TGCAGGATTT 1680

ACGTTCCGAT AAAATCAATG GAGTAATCAT TTCCAATCCG CCTTATGGTG AACGTTTGTC 1740

AGATGATGCA GGGGTGACCA AGCTCTATGC TGAGATGGGG CAAGTATTTG CACCGCTGAA 1800

AACTTGGAGC AAATTTATCC TGACTAGTGA TGAAGCTTTT GAAAGCAAGT ATGGTAGCCA 1860

AGCAGATAAG AAGCGTAAGT TATACAACGG AACCTTGAAA GTGGATCTAT ATCAATATTT 1920

TGGTCAGCGT GTCAAACGGC AAGAGGTAAA ATAGAAAGGG ATACTCATGA GTAAAAAAAG 1980

ACGAAATCGT CATAAAAAAG AAGGTCAAGA ACCGCAATTT GATTTTGATG AAGCAAAAGA 2040

GCTAACAGTT GGTCAAGCTA TTCGTAAAAA TGAAGAAGTG GAATCAGGAG TCTTGCCTGA 2100

GGATTCCATT TTGGACAAGT ATGTTAAGCA ACACAGAGAT GAAATTGAGG CGGATAAGTT 2160

TGCGACTCGT CAATACAAAA AAGAGGAGTT CGTTGAAACT CAGAGTCTGG ATGATTTAAT 2220

TCAAGAGATG CGTGAGGCTG TAGAGAAGTC AGAAGCTTCT TCGGAGGAAG TTCCATCTTC 2280

TGAAGACATC TTACTACCCT TGCCTCTGGA CGATGAGGAG CAAGGCTTGG ATCCTCTATT 2340

GCTAGATGAT GAAAATCCAA GAGAAATGAC TGAAGAAGTG GAAGAGGAGC AAAACCTTTC 2400

TCGTCTGGAT CAAGAGGACT CAGAAAAGAA AAGTAAAAAA GGCTTTATTT TGACCGTTTT 2460

GGCGCTTGTA TCAGTAATTA TTTGTGTCAG TGCTTATTAT GTCTACCGTC AAGTGGCTCG 2520

TTCGACTAAG GAAATTGAAA CTTCTCAATC AACTACAGCC AATCAATCGG ATGTGGATGA 2580

TTTTAATACA CTTTATGACG CCTTTTACAC AGATAGCAAT AAAACGGCTT TGAAAAATAG 2640

CCAGTTTGAT AAACTGAGTC AACTCAAGAC TTTACTTGAT AAGCTGGAAG GTAGTCGTGA 2700

ACATACGCTT GCCAAATCTA AATATGATAG TCTAGCAACG CAAATCAAGG CTATTCAAGA 2760

TGTCAATGCT CAATTTGAGA AACCAGCTAT TGTGGATGGT GTGTTGGATA CCAATGCCAA 2820

AGCCAAATCG GATGCTAAAT TTACGGATAT TAAAACTGGA AATACGGAGC TTGATAAAGT 2880

GCTAGATAAG GCTATCAGTC TTGGTAAGAG CCAGCAAACA AGTACTTCTA GCTCAAGTTC 2940

AAGTCAAACT AGCAGCTCAA GTTCAAGTCA AGCAAGTTCA AATACGACTA GTGAGCCAAA 3000

ACCAAGTAGT TCAAATGAGA CTAGAAGTAG TCGCAGTGAA GTCAATATGG GTCTCTCGAG 3060

TGCAGGGGTT GCTGTTCAAA GAAGTGCCAG TCGTGTTGCC TATAATCAGT CTGCTATTGA 3120

TGATAGTAAT AACTCTGCCT GGGATTTTGC GGATGGTGTC TTGGAACAAA TTCTAGCGAC 3180

TTCACGTTCA CGTGGCTATA TCACTGGAGA CCAATATATC CTTGAACGTG TCAATATCGT 3240

TAACGGCAAT GGTTATTACA ACCTCTACAA GCCAGATGGA ACCTATCTCT TTACCCTTAA 3300

CTGTAAGACA GGCTACTTTG TCGGAAATGG CGCTGGTCAT GCGGATGACT TAGATTACTA 3360

AGCAGTCGG 3369



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9713 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

AAGTTTACAA TTTAAATGAA TTAACAATTT TCCCAACTAA AAGCACTCCA GTTACCGCAA 60 

CGTTTGTACT GAATGTACTA AATCGCATTC CATCAACTTC ATCTGTTTCG TCAACTTGAA 120 

CAGATACTAA TTGAAGATTT AATACTTCTG CTGCCATAGC TAGCTCCTCC TATTTAAATT 180 

TTTGGGATTA AGTACTTTAT CCACCCTCAT ATACTCTCTC CACCAGTAAA ATGCAAGCAA 240 

TGATACAAAA TAGATTTAAC TATTTTATAT AGCGAAAACT TACAAATTTT TAAGAAATAA 300 

TTTTTGCATT CTTAAAGATA AAATAGGAAC TTTTAGTAAT AAATATTAAA ATAAATAAAA 360 

TAATAGATAC TATAAAATTT GGAAGTATTA ACCCCAAAAG ATTCATATCA TCTATTAAAA 420 

TATCCTCTAA AGAGTAGTAT ATTAAAGCCA TAATTTTAAT GTTAAGTAAA AATGCAATTA 480 

ATGAAGTAAC AAATGTCAAA AATATAGCCT CACCAACTTT AATCTTAACC ATCTGGTAAT 540 

TAGAAGTTCC TAAAATTTCA AATTGCTGAA TCTCAATCCT TTCTTGATGC GATGACAAAA 600 

ATGCAATTGA AATAATATTT GCAAGTACTA TCAAAATTGG TGCTCCTACA TAGACAATAA 660 

ATGCTACTTT TAGCTCTAAA TCACTGTCAT CTTGAAATTG AGATAGTATA TTCTGAGAAA 720 

TCATTTGAAA ACTAGAAATT AGTAATATAG CTCCTGTAAT TGCAGCACTG ATAGATTTTA 780

TATAAGACTT ACAATATAGT AAATTCCACT TCGAAACAAT GAACATAAAA TTATTTCTAA 840 

ATATAATTAT AGAAAGTAGT TTGATAAAAC ATGACTGTAT AAAAGGAGAT AATTGATAAA 900 

TAATCACAAT ATCTAAGATT ACAATATTGA ATATTATCTG GGCCTTCGCT AAAATTGTGC 960 

TATCTTGGAA AATTTGTTGC AAAGAAAGCA ACCAGATAAC ACTAAAACCA GCCAATAGCA 1020 

GTATTCTTTT TACTATTGAA AGAACATGCC TTATTTTAGA ACTCTTCCTA TTTCTAATCT 1080 

TCTTGAACGT ATAAAAGCAA CCACTTAGAA AGGCTAAAAA TGAAATCAAC ACTACTGTAA 1140 

TGATACATCC AACAGCACTC GTTTGAAATT GGATATCAGG TAATATATTT TCCCCGAAAA 1200 

AGTATTGTAA AAAATAATAA TAATTTGACG TAACAAATAT AGAGCATAGA TATGCAATAA 1260 

AACTAATAAT CGAGGAAATG ATAAAAATCT GTCCCCCCAC AAGAAATGAT AGTTGAAGGC 1320 

GACTTGCTCC CAACACCTCC AGAAGTTCGT AATCATCTCT AAAAATTTCA ACCAACATAT 1380 

TTATTATGTT AGAGAGCACA AAGAATAATG TTACTCCTCC GAATACTATC GGAAACATAA 1440 

AAATTGGTTT AGGATCTGGA AGTCCGACAA ATACTTGCGA ATTATTCTCA ACATTAATTA 1500 

CCCCATTAAC AGCCAATCCC ATAACTAAAC TCGAAACAAA AATTACTGGT GAAACGCCTA 1560

ACCATTGTTT CTTATTATGT AAAAATTGAT AGTAAACTAA TCTGAGCATC TCTATTCCTC 1620

CGTAGTTGAT TGTACCTCTA AGATTTTATA CAACTCTTCC CCGCTAGGTC TATGAAGTTC 1680

TTTGAAAATT TTTCCATCTT TCAATATTAA TGCACGATCA GTTTTCGAGG CCAATTCTAT 1740 

ATCGTGCGTT ACCATAATTA CACACTTACC CGCCCCTACT AACTCTCTCA ATAATTCAAA 1800 

AATTACTTCA CGAGAAACGC TGTCTAAAGC CCCAGTTGGC TCATCAGCAA ATATTATATC 1860

ACTATCAGCA ATAACCGCTC TAGCTATAGC AACCTTCTGT TGTTCTCCAC CAGACAGAGT 1920 

TCCAACAAAA TCGTTTAAGC CAGCATTAAA CTTCATTCTT TTGAGTAAGT TTTCTACATT 1980 

TTTAATAGTT AATTTTTTTT GTGATAATCG CAAAGGAAGT GCTATATTTT CTATTACCGG 2040 

CAGGGAAGGT ATTAAATTGT ATGCTTGAAA TATAAAAGAT ACTTCGTTAC GTCTTATACT 2100 

TGACAATTTT GCATTTCTGA TTTTATAGGG GTTGATTCCA TTTAAAATTA CTTCCCCACT 2160 

TGTTGGTTCA AGCAAACTAG AAATACATTT TAATAAAGTT GACTTTCCAG AACCACTAAT 2220 

TCCTAGAATA CTTATAAATT CTCCTCTCGA AGCAGAAAGA GAAACATTTT TCAGCACTTG 2280 

CAACGTTTTA TTATTTCCTA GTAAAAATTG ATGATACAGC CCTTTCACTT TTAATATATA 2340 

ATCTTTATCC ATATTCTTGC CTCCAATCAC TTAATTTTGA AAAGTGTTCC ATTTXCCAAT 2400 

TTATATATAT CAGTGTATCT CTTGTCATTT AAGTCATAAT GATGTGAAAC TTCAATAAAT 2460 

GAAATAGCTA AATTGAACAG AATATCATGT ATGGAATTTG AATTATCATT ATCTAAATTA 2520 

GCTGATATTT CGTCAAATAA GTACACTTTA TTATTTCTAA TCAGAGCTCT AGCTAAAGCT 2580 

ATTTTTTGTT TTTGACCTCC AGACAAATTA CTACCATTTT CACGACATTG ATAATTTAGT 2640 

ATATCTATCT TTTCTAATTC TTCATATAGA TTTACCTTTT TTAACACCTC AATTATGTGA 2700 

TCATCTGAAA AATATTCATT TTGAAATAAA GTTACGTTCT CACGAATAGT AGTGTCAAAA 2760 

ATATATGGTG TCTGATCAAC TGTTGGTATT GAATCTGAAC TCTTTTTCCC ATGTGATAAC 2820 

AAATTTACAT AACCTTTTTG TGGCTTTAAA GAACCATTAA TTAAATTTAA AATCGTTGTT 2880 

TTCCCACTAC CAGAAGTTCC TGTTAATAAT ACCCTAAATG GTGACTTAAA TGAGAAGTCA 2940 

ATACTTAATT TATTTTCTGG TGTAATAGAA TATACAACAT CTTTCATGTG TATCTCATCT 3000 

ATTGATGAAG TATACAGTCC GTTATTATCA TGTTCAGCGT CTATAAAATT CTTCTCTCCA 3060 

CTTAAGTATT TTAAAAACGG TTTCCTTAAA TCTTTGGTTG TATTTATCTT ATTTAATGAA 3120 

TAGGCAATTG ATTGTATCGG CCCTAAAACT TTATCGTTTG CTAAGAAAAT ACCTATGAGT 3180 

TCACTAAAAG AAAGGCTTTT ATGATAAATT ACAAAATAAC ATGCTACAAC CAAGGGAACT 3240 

AGAAAGCAAA AACCTGAAAT TAGTACTGCA ACCAATTTTG AAAGAACCTC TGATCGTTTC 3300

AAATTAAAAG TAGAATCTTC TAGTTTATCC AACTTTTTAT CCGACAAACT AATTATTTCT 3360

TTAGTAACAG AATAAGATTT TAATGTCTTA AAACCATTAA AAATTTCTTT TATTATGTGA 3420

GTATACTCTG CATTGCTGTT AGAGTACTCA TTAGCTGAAT TAGACAACAT CTTCTTCATA 3480

AAGACAGGTA CTATAATCGG CAATGCTGAT AATACAATAA ATATTATTGA nACTAGGAAG 3540

TTTAAATAAA GCATAAAACT TAGAGAGACG ATGAACAACA ATATTGAAGA AATTATTTCA 3600

AAAATTTGTC TAAAATAGTT TTCTTCGATT AATCTCAAAT CATTTGACAA AACTGAAATA 3660

ATAGATGAGT AATCTTTAAC CATTTCAGAA GAAAGATACT GTTCTCTAAA ATATCCTTGT 3720

TTAATTTTTA CATTTATATC TTTAGTTATT GATGCTTCCG TTACTTCTAA ATAGTAATTT 3780

GATATATAGA TTGCTGACCA ACCCAGAATA CTTATAGCAC CAAATCTTAG AACGTCAGAA 3840

AATGAGGAAG TCTGATTTAA ACTACCTGCA TATACAATAA TTCCTGAGAG CAAGACACCA 3900

TTAAACGAAG ATAGAAATAT TAAAATCCCC ATTAATATAA GTTTAGTCTT TTTTATAAAT 3960

TTTAAATAAT TCATAAGTTA TTCCTTCCCA CTTCTTCAAA GAAATAATTT AAAGTATCAA 4020

TCATTAAGAG AACATCTGAT GGAGTAAAAC CTCCATGACC AGCTGCTTTG TTTAAATACA 4080

ACAAACTTTT AACTCCAATA GAATTTAATT TCTTTGACCA CTCTATCACT TCGTTATTAT 4140

TAATATATGG GTCTTTCTCA CCCAAAATAT TAACTATAAC AGTATTTGAG TCTCGTGCCT 4200

TTTCAATATT TTGCATAGGC GAATATGACT TTATATAAGC CTTTACTTCA GGGTCTCTAA 4260

TATCTCCCCA CTCTGCTATT TCGGTCTTAG AAAGAGGATC ATTTGGATTC TGAAGTGTAT 4320

CATAAGGATT TATAAATGGC GAAAATAAGA GAATGCTTTG CAATAAATTT TTTTCCTCGT 4380

TCAACACCGC ACCAGCAATT ATTCCACCTG CACTAGAAGT TATTAAACCT AATCGCTTAC 4440

TGTCAATTAC ATCATTTTCC CTTAAATAAT TTACTCCCTC AATAAAATCT CTGATAGAAT 4500

TCCATTTGTT TAACGCCTTT CCTGAGCGAT ACCATTCACC ACCCAAATAG CCTCCACCTC 4560

TTACATGAAC TATAGCATAA ATAAAACCTG CATCTATTAT AGATAACATA ATTTCATCTA 4620

AATCAGAATT ATCATTCTTA CCATAAGCCC CATAGACACT TAGAATACAT TTTTTTCTTC 4680

TTGGGAGCTC ATCCGTATCT TCACTTTTCC AAAATAAAGA AATCGGTATG CTTACATCAT 4740

AACTGTCTTT TTTAGTCCAA ATCACCTTAG AAAAATATTT AGTATTATTC GATTTTATGA 4800

TGGGTCTTTC AAATTCAGTT TTTAATGTAT TTTCTATTAA ATCAAAACTA AGTATTTTTT 4860

CGTAAAAAGT TCTCCTCTCT AAAAACAGAA GAACAOGATC AGAAAATGAA TTTTCATAAA 4920

GTGTTGTCTT TTCATCAAAT GTTATCTTAT TAACACTCAA CTCCCTCAAA CTATTATTTT 4980

TAAATGTAGC AAGATAAAAG ACGGAATTCG CTGCGTTTGA ACAGTCTAAA AGGATATAAC 5040

GTCCTATACA GTGAACTCTT CTAGCCCTAT CTTGATATGG TATAGTAATA GAAACTCTGT 5100

CTCCCGAAGA AGTTTCCCTT AGAATTAGTT GATCTTTCTT TTCTTCAGTT GAAGAGAGCC 5160

CAAGAAAGTA CTGTGCTTTT TCTGTACTAA ATAGAGCGAT ATCTCTAGGT GTTGGGGCTA 5220 

CCGTTTCTGT GTAAGAGTGT CTAACAAAAC CCGTCCGGTC GAAACTGTAT AGAAAAATCC 5280 

TGCCTTTCTG AAAGTCTACT GACTTTACAA AACAATTATT GCTATCAATG TGGACTATTT 5340 

TTAATCGAAA AGAGCATTCG TTTTCTTCAA ACAGTTCCTC TTCTGTAAAG CTATCAAAAG 5400 

ATTTATAGAA TAACTTACTT GGCCTCCCGT ACTCTTTGGA GCGAGTATAC ATAACACCGA 5460 

ATTTACCCAA ATAGAACGAA CTTTCTACTG AAATATCTTC AATGATAAAT AACTCTTCCA 5520 

TAGTATATTT TTTTATTCCA ATTAAATTAG TCGTACGCAG TGAGGATACA ACCAAAACTA 5580 

TATAACTCTC ATCAGATGAA ATCCTAACAT CCTGTAAGAT ACTATCATCT GGCAAAGTAT 5640 

ATTTTTCCAC ATCAAAGACA ATTTTAAGTG AATTTGAATT GTCTAAACTG GAAGAACTAA 5700 

CCTTAGGAAT CCAGTCATTA TCTTCGACAT ACCATTCCTT TATTACACCA GTATTGGGTA 5760 

TACTCCAATT ATCAAATTGG TACCAATATC GCCCTCTCCT AAATATCAAA GAATTCCATT 5820 

TTTTTAATTC CTGAAATGAT GAAGAGATAG ACCTCTTATA GTGTGTTTTT TCCTGTATTG 5880 

TATTTAAAAA TATTTCATTA CTCTGATTCA CAAGTATGAC CCCTTAATAA TGGTATCTAA 5940 

ATATTATATT TGAGGAAGAA TCGTCAATTT ATTATCCATT ATTGATACCA ATCCAATTGC 6000

AACACCCGCA AATCCCGAAG CAATATCTGT TGTTATCTTT AAACCATTAT CTCCCGCAAT 6060 

AACAAATCCT TCTTCAATTA CACACAAATA TCTATAAAGT TGTTGAATTA ATTTCTTTTG 6120 

TCCTGAAAAG TTATCATCGA TATCACTATA TATATTATTA GCAACTTCAA GACCACAAAA 6180 

TCCGTTAAAT AAACCTGGTA ATACACAAAA AACTACATCA GTTGCCCTCT CTAAAGAAGT 6240 

TAAATATTTT AAGTATTTGC TTGACAAGAT TTCTTTATTT CTATTAATAA GTAAAAGCAG 6300 

GCCAGCACTT CCAGTTGCTA GATATGGTAG TAATCTATGA CCTTGGCTGT ACTGCAATGA 6360 

ATTATTACTA TCTACTTTAT AAGCAACTAA TTCTTTATCT ACAGGCAATT CTAGACCATT 6420 

TTTATAGATA CTTTCACCAG TTAATTTATA AGCTTCACCG AAGAGCCAAG CTACCCCTGC 6480

GTGACCATAT AGTAATCCAC CAAAATTCTC ATAAGGATCG TTACTCTGAA CATCACTAGC 6540 

GCCAACTTTA CAAAAAGTTT CTGGATTTTC TATATAATTT AAAGTATATT CTCTAAGCCT 6600 

AATTAGTATT TCTTCTCCTA GTTTATTATC AATTCCCCCT TTACTAAGAA AATACAGTCC 6660 

AACCAGTAAA ATTCCAGCCT GCCCACTATA TAAATTTTTA TTTTGTGAAT TCTCAAATAT 6720 

CTCTATAAAA TGAGTTGTAA AAAGTTCAAC TGCCCGATCT ATCTCCCCAA ATTCATAAAT 6780 

GAGCCAGATT GTACCAATTT TACCATCAAA AAGACCAGAA AGGGACGATT TCTTAAAATT 6840

ATTTACTGCC TCATTAATAA CCTGTGTTCG AATCTCATAA TAGTCATCAA ACTTGAAATT 6900

TTTTACTTTC TTAGCTAGTT GTTGATAACT CCAAAGGATA GCTAAATCTG AAAACGCAAT 6960

TCCTTGATTA AAATTCAGAC CATAATAATG AACTGGGAAG AATCTTGATT GAAATTCTTT 7020

ACGCCACTGT CCATAAGTTA GCGTAAACCC TCTCAATAAT TTTATAATAA AATCTTGTAT 7080

ATCTTGCTCA CTCTCGATAG TTCTAATCTC ATGCATGGGT TTTAAAACTT TTTTCCTGGA 7140

AATATTCTCA ATCTGTGGAC ATTTAGAATC TAGATATGAC AATAAACTTT CTACATAATC 7200

TATATGTTCT CTTGTATAAC CCAAAGACTC AAATAGTTTT TTTCCTTCTA TCCTGGTTTG 7260

ACTTACATAG TTGTATGTCA AATCCGATGT AGTTACTAGT GGCATGTATA AATAATGAGC 7320

TATTTGTCTA ATACCATACC AATCTATCTC ACTGGGAAGT GTTTCTCGCC ATGCTCTAAA 7380

ACCAGGGGCT GCAACTTTAT GTACAACTTT TTCATCATTT GAAAAGACAG CCTGTTCCCA 7440

GTCTATTATA CTAATCTCAT CTTCATCCTT AACCAAGATA TTTCCTAAAT GTAAATCTTG 7500

ATGATATACA TTTTCAGAAT GAAACTTATT CGTTAAATCG ATGAGTTTTT CTACTATCTT 7560

TGAAACTCTC AATAGATAAT CTTTGGTCTT ATCAACAACT TCATATAAAG GAAAATTATT 7620

GGTAACCCAT CTATTTAGTG GAACGCCCTT CATATGTTCA ATTCCTAAGA AGGTGTGCTC 7680

CCAGATCTTA CCGTGCCAGT ATATTTTAGG CGTCTCACTC CATTCATTTA GAATTTTTAG 7740

TGCTTTGCAC TCCGAAGCTA ATTTCTCTGA AGAATAAGTA CCATCAAATC CTAGACCTGT 7800

ATACGGTCTA GCCTCTTTTA AAATTATTTT TTTCCCATCT TCTTTTAGCC TAGCATTATA 7860

TATCCCACCA CTGTTTGAAA ATCTAATTGC ATTATCTATA ATAAAGGGAA AGTCTCCCTG 7920

TTTTTTATCT TTCTTGTCAA GCCATTTATT CAAAAAGTCA GGGGGCACTA TACCTTTTGG 7980

AATTTTAAAT ACTGGTAAAC GTTCATCTTT AACAACTTCA TCGCCAACAA TTAATTCATC 8040

AATAGCAACC TTCTTTTCAT CATCCCTTGA CGGCCTAAAC ACACCATACC TCAGATATAT 8100

TGGTGCTTCA TCCCAACGTT TATCGCTTAA AATATATGGC CCATTATATT GCTTTAAGGC 8160

ACTTTCTAAC CTTTGCAAAA CCGACTCTAA TTCATTTTGA TTTGGATAAC ATGTAATAAA 8220

TTTACCAGAA AATCCTCGAC TAACCAATTT CCCGTTTCGC ATGATAAATT TGTCTTCTGT 8280

ACTAAGATGT TTAAATGGAA TTCGCATTTC ATGGCAAATT TTTGCTACAT CTTGTAACAA 8340

TTCATGTGAA CTGTTATACT CTGAACTAAT GTGTATTTTC CACCCTTGTC TTTCAACAAA 8400

TTTTCCAATA GGGTATTGAT AAACCCACTC ATCATTATTC ATTACTTCGT GCCAATTAAA 8460

AGGCAGACTT ACTTGGTACT TTATGCTAGT ATCTGTACTA TAATCATTAT TAGTGAAAAA 8520

GAAAGGATGC TCCAAATTGA AATTATAATC CATAACAAAA TCTCCAAGAA ATTTTATCAA 8580

ACTTAATATA TCTATAGCTA GACAGACTTA TTTAAATAAA AAGGGAGAAT CCTTTGGATT 8640

CTCCCCATAT AAGCACTAAC ATTCCAACGT GCACATATTG GAACGACATC CATAACTCCA 8700

GAGAATCTCT AAAGTTTACA ATTTAAATGA ATTAACAATT TTCCCAACTA AAAGCACTCC 8760

AGTTACCGCA ACGATTTGTA CTGAATGTAC TAAATCGCAT TCCATCAACT TCATCTGTTT 8820

CGTCAACTTG AACAGATACT AATTGAAGAT TTAATACTTC TTCTGCCATA GCTAGCTCCT 8880

CCTATTTAAA TTTTTGGGAT TAAGTACTTT ATCCACCCTC ATTATACTCT CTCCACCAGT 8940

AAAATGCAAG CAATTATACA ATGTTGTCAC ATAGAAAATA ATGTTTCCGT AACTTTTGAA 9000

AGTAACTTCC ATCTCTCTCC CAAAACTGGA AGTTAGTTTT AGAAGTTACC TAAAAATCAG 9060

GTCACCTATT TTAAAAAAGC AGCAAACTAT AAACTAGTAG GTTCCACACC AAATGTAGTC 9120

CCATACTGCC CCATAAGTCA GATTTATAGC GCACCATACC TAAAAACATC CCAAGTGAAA 9180

CATACAAACA CCAAGCTAGA ATGGTTCCTG TATGATGTGC TAAGGCAAAT AAAACACTTG 9240

TCAAAGCAAC TCTGATATCT AATTTTCTGA CCAAATTCCA TAAAATTTCT CGATACAGAA 9300

ATTCTTCAAC CATACTCGCA TTGATTAAGA ACAATAAAAA TGAAAACCAA GGAATTTGAT 9360

GTTGAAGGCC AATTAAGTTT GCTTGATTCG TGCTTCCTTG AGCATGAATC AGACTAAAAC 9420

ATAGACTTAT AATCAGTAGG CTAACAAATT CAACACCAAG CCATTTCATC CTAGATTTCA 9480

TATTGACCTT ATGCGCTTGT TTGCGTTGGC CATACATCCA TAAAAAAGAA ATGAGTGACG 9540

AACCATAGAG AATCTGTAGT ATAGTTmACT CACCGATACA AAGAAATTTC AATAAGTATA 9600

GAGrTACCAA TAsGACATTT ACTTGTTGGA ATATATAAAC TGGAATTATT CTTTTCATAG 9660

TTACCTCCGA AATAAATCTT CATAATCTAA ATCTAATACC TGCACAATCC TTT 9713



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8657 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 


AAAGAAATTG TCAGAGAGTG GCTAGATGAA GTAGCAGAGC GGGCTAAGGA CTATCCAGAG 60 

TGGGTGGATG TTTTCGAGCG TTGCTACACC GATACCTTGG ACAATACGGT TGAAATCTTA 120 

GAAGATGGTT CAACTTTTGT CTTGACTGGG GATATTCCTG CCATGTGGCT TCGAGATTCG 180 

ACAGCCCAAC TCAGACCCTA CCTTCATGTA GCTAAAAGAG ATGCCCTCCT GCGTCAGACC 240 

ATTGCAGGTT TGGTCAAACG TCAGATGACC TTGGTACTCA AGGATCCCTA TGCTAACTCC 300

1 i LAALA 1 Xu AGGAGAACTG GAAAGGGCAC CACGAGACTG ACCACACAGA CCTTAACGGC 360

1 GGATCTGGG AGCGCAAGTA TGAGGTGGAT TCGCTTTGCT ATCCTTTGCA GTTGGCTTAT 420

CTCCTCTGGA AAGAGACTGG CGAGACTAGT CAGTTTGATG AGATTTTTGT CGCAGCGACT 480

AAGGAAATTC TCCATCTGTG GACGGTGGAA CAAGACCACA AGAACTCTCC TTATCGTTTT 540

GTCCGAGATA CGGACCGTAA GGAAGACACC TTGGTAAATG ATGGCTTTGG ACCTGACTTT 600

GCAGTGACAG GTATGACTTG GTCAGCTTTT CGTCCGAGTG ATGACTGTTG CCAGTATAGT 660


TACTTGATTC 


CGTCAAATAT 


v x x x vjx_ i^i f\ 


GTAGTCTTGG 


GTTATGTGCA 


AGAAATCTTC 


720 


GCAGCATTAA 


A PPT Af3P*TY2 A 

rVV.\. X AU\_ 1 VIA 




GTTATTGCTG 


ATGCCAAGCG 


TCTTCAGGAT 


780 


GAAATCCAAG 


A A ATP A A 


a a aptapppt 

AnAL X A(_vj v. 1 


TACACCACCA 


ACAGCAAGGG 


CGAAAAGATT 


840


TACGCTTTTG 


A A (WCZTZ A TV3f2 


rpTirr 1 a a at 


GCCAGCATCA 


TGGATGATCC 


AAATGTACCA 


900 


AGTCTACTAG 


V» 1 v)V»V3t_v*V X A 


Tp*iY2rir;pT ap 


TGTTCGGTCG 


ATGATGAAGT 


GTATCAAGCT 


960 


ACTCGTCGTA 


PPATTTTYSAP, 


PTPTP.A A A AT 


CCATACTTCT 


ACCAAGGAGA 


ATACGCAAGC 


1020 


GGTCTCGGCA 


PjTT PTP AT A P 




TATATCTGGC 


CAATCGCCCT 


TTCTATCCAA 


1080 


GGCTTGACAA 


PA Jifl AH AT" A A 


p-p.pAP.ma a a 


AAATTCTTGG 


TGGATCAGCT 


GGTTGCCTGC 


1140 


GATGGTGGTA 


CAGGTfiTPAT 


P.PAPP.R A AP.P 


TTTCATGTAG 


ATGATCCGAC 


CCTCTACTCT 


1200 


CGTGAATGGT 


tptvptyw^p 

X V* X V— V— X xjVjVjv^ 


X /v\v* A X V_J^\ A \J 


TTCTGTGAGT 


TGGTCTTGGA 


TTACTTGGAT 


1260 


ATTCGCTAAG GGGCTCGCTT TAGCTCAACP. GATTCTTATC AGAATCACAA GTTTACATTT 1320

AAAACGTTAA AATTTAAATT TAGAATGAGG TTTTACTTCA TGGAAAATGT TGTTGTACAT 1380

ATTATCTCAC ATAGTCACTG GGATCGTGAG TGGTACTTGC CTTTTGAAAG CCATCGTATG 1440

CAGTTGGTGG AATTGTTTGA CAATCTCTTT GATCTCTTTG AAAATGACCC TGAGTTCAAG 1500

AGTTTCCACT TGGATGGACA AACTATTGTC CTTGATGACT ACTTACAAAT TCGCCCTGAA 1560

AATCGCGACA AGGTCCAACG CTACATTGAC GAGGGCAAAC TTAAAATTGG TCCCTTTTAC 1620

ATCTTGCAGG ATGACTACTT GATCTCCAGT GAAGCCAATG TCCGCAATAC CTTGATTGGT 1680

CAACAAGAAG CTGCCAAATG GGGTAAATCA ACCCAGATTG GCTACTTTCC AGATACCTTT 1740

GGAAATATGG GACAAGCGCC TCAAATTCTT CAAAAATCAG GCATTCACGT GGCGGCCTTT 1800

GGTCGTGGTG TGAAGCCGAT TGGATTTGAC AACCAAGTCC TTGAAGATGA GCAGTTTACG 1860

TCTCAGTTTT CAGAAATGTA CTGGCAGGGT GTGGATGGTA GTCGTGTTTT AGGTATTCTC 1920

TTTGCCAACT GGTACAGTAA CGGGAATGAA ATTCCAGTTG ACAAAGATGA GGCCTTGACC 1980

TTCTGGAAAC AAAAATTGTC AGATGTGCGT GCCTACGCTT CGACCAACCA ATGGTTGATG 2040

ATGAACGGCT GTGACCACCA GCCTGTACAG AAAAATCTGA GCGAAGCCAT TCGTGTGGCA 2100

AATGAACTCT TCCCGGATGT AATCTTTGTT CATAGTTCTT TTGATGAATA TGTTCAAGCT 2160

GTAGAAGGTG CGCTTCCTGA ACACTTATCA ACTGTTACAG GCGAGTTGAC CAGTCAGGAA 2220

ACAGATGGCT GGTACACACT TGCCAACACT TCTTCATCCC GCATTTACCT AAAACAAGCC 2280

TTCCAAGAAA ATAGCAACCT CCTAGAGCAA GTGGTAGAAC CCTTGACTAT TATCACTGGT 2340

GGACACAACC ACAAGGACCA GTTGACCTAT GCTTGGAAAA CACTTTTGCA GAATGCGCCA 2400

CATGATAGTA TCTGTGGCTG TAGCGTGGAC GAAGTTCACC GCGAGATGGA AACGCGTTTT 2460

GCCAAGGTCA ACCAAGTAGG AAACTTTGTT AAAAGTAACT TGCTCAAGGA GTGGAAGGGT 2520

AAAATTGCTA CGGATAAGGC TCAAAGTGAC TATCTCTTTA CTGTCATTAA CACAGGCTTG 2580

CATGATAAGG TCGATACTGT CAGCACAGTG ATTGATGTGG CGACTTGTGA TTTCAAGGAA 2640

TTGCACCCAA CAGAAGGCTA CAAAAAGATG GCTGCTCTTA TCTTGCCAAG TTACCGTGTG 2700

GAGGACTTGG ATGGTCGTCC TGTAGAGGCT ACAATCGAAG ACCTCGGAGC TAATTTTGAG 2760

TATAATTTAC CAAAAGACAA GTTCCGCCAA GCTCGTATTG CTCGTCAAGT GCGCGTGACC 2820

ATTCCAGTTC ACCTAGCGCC GCTTTCTTGG ACAACCTTCC AATTGCTGGA AGGAAAACAA 2880

GAACACCGTG AGGGTATTTA CCAAAACGGA GTGATTGATA CACCATTCGT AACGGTGAGT 2940

GTGGATGACA ACATCACAGT CTATGACAAG ACAACTCACG AAGCCTATGA AGACTTTATC 3000

CGCTTTGAAG ACCGTGGGGA CATCGGAAAC GAGTATATCT ATTTCCAACC AAAAGGAACA 3060

GAGCCAATCT TTGCAGAGCT TAAGGGCCAC GAGGTCTTGG AAAACACAGC TTGCTATGCT 3120

AAAATCTTGC TCAAACATGA ATTGACCGTG CCTGTCAGTG CGGATGAAAA GCTAGAAGAA 3180

GAGCAACAAG GTATCATCGA GTTTATGAAG CGTGAGGCTG GACGGTCAGA AGAATTGACA 3240

AACATTCCTC TGGAAACTGA GTTGACTGTC TTCGTTGACA ATCCACAAAT CCGCTTCAAG 3300

ACTCGCTTTA CTAACACTGC CAAGGATCAC CGTATCCGTC TCTTGGTCAA GACTCATAAC 3360

ACGCGTCCAA GCAATGATTC TGAAAGTATC TATGAGGTGG TGACACGAGG AAACAAACCA 3420

GCTGCTTCAT GGGAAAACCC TGAAAATCCT CAACACCAAC AAGCTTTTGT CAGTCTGTAT 3480

GACGATGAAA AAGGGGTGAC TGTATCCAAC AAGGGATTGA ATGAATACGA AATCCTTGGG 3540

GATAACACCA TTGCCGTGAC CATTTTGCGT GCATCAGGTG AGCTAGGTGA CTGGGGCTAC 3600

TTCCCAACGC CAGAAGCACA ATGCTTGCGG GAGTTTGAAG TCGAGTTTGC ACTTGAATGC 3660

CACCAAGCCC AAGAACGCTT CTCAGCCTAT CGTCGTGCCA AAGCCTTGCA GACACCGTTT 3720

ACCAGCGTTC AGCTTGCTAG ACAGGAAGGA AGCGTGGTTG CGACTGGTAG CCTCTTGAGC 3780

CATTCTGTTC TCAGCATACC GCAAGTTTGT CCAACAGCCT TTAAGGTAGG TGAAAATGAA 3840

WvWjW. 1 A 1\j TGCTTCGTTA CTACAATATG TGTAGTGAAA ATGTACGTGT GCCAGAAAGT 3900

C & 11 A I'P'PP'I* TCCTTGACCT ACTTGAACGA CCATACCCAG TTCATTCAGG ACTATTGGCT 3960

*-LAt-AAGAGA TTCGTACAGA ATTCATCAAA AAAGAAGAAA TTTAATTTCA AAAAGTAAAC 4020

Ai'uAAAAGAA AGGAGGGGCG AAAAAGTAAG AACTAACTGC TGATTCGCCC CTTTTATGGT 4080

AAAAACAATG AGCATTGCAA CGATTGATAT CGGAGGGACT GGGATTAAGT TTGCCAGTCT 4140

GACTCCTGAT GGGAAAATAC TGGATAAGAC AAGTATTTCA ACGCCTGAAA ACTTGGAGGA 4200

TTTACTAGCG TGGCTAGATC AACGCTTGTC AGAACAGGAT TACAGTGGGA TTGCTATGAG 4260

CGTTCCAGGT GCAGTCAATC AAGAGACAGG TGTGATTGAT GGCTTCAGTG CGGTGCCCTA 4320

CATCCATGGC TTTTCTTGGT ATGAGGCGCT TAGCTGTTAT CAGCTACCTG TCCATTTAGA 4380

AAATGATGCC AACTGCGTTG GACTCAGTGA ACTACTAGCT CATCCAGAGC TTGAAAATGC 4440

AGCCTGTGTC GTGATTGGGA CAGGGATTGG CGGAGCCATG ATTATCAATG GTAGACTTCA 4500

TCGAGGTCGC CACGGTCTGG GTGGAGAATT TGGCTACATG ACAACCCTTG CCCCTGCTGA 4560

AAAACTTAAT AACTGGTCGC AACTAGCATC AACTGGGAAT ATGGTACGAT ACGTGATTGA 4620

AAAATCTGGT CATACTGATT GGGACGGTCG CAAGATTTAC CAAGAGGCCG CAGCTGGTAA 4680

TATCCTTTGT CAAGAAGCCA TTGAGCGCAT GAACCGCAAT CTGGCGCAAG GCTTGCTCAA 4740

TATCCAGTAT CTGATCGATC CAGGTGTCAT CAGTCTGGGT GGCTCTATCA GTCAAAATCC 4800

AGATTTTATC CAAGGTGTCA AGAAGGCTGT TGAAGACTTT GTCGATGCCT ACGAAGAATA 4860

CACGGTCGCA CCAGTTATCC AGGCCTGCAC CTATCACGCA GATGCCAATC TCTACGGTGC 4920

TCTTGTCAAC TGGCTACAGG AGGAAAAGCA ATGGTAAGAT TTACAGGACT TAGTCTCAAA 4980

CAAACGCAAG CTATTGAGGT TTTAAAAGGT CACATTTCTC TACCAGATGT GGAAGTGGCT 5040

GTCACTCAGT CTGACCAAGC ATCTATCTCT ATCGAGGGTG AGGAAGGTCA CTATCAATTG 5100

ACCTACCGCA AACCTCACCA ACTTTATCGT GCCTTGTCCT TGTTGGTAAC AGTTCTAGCA 5160

GAAGCTGATA AAGTAGAGAT TGAGGAACAA GCAGCTTACG AAGATTTGGC TTACATGGTT 5220

GACTGTTCTC GAAATGCGGT GCTGAATGTG GCTTCTGCCA AGCAGATGAT TGAGATATTG 5280

GCTCTCATGG GCTACTCAAC CTTTGAGCTT TACATGGAAG ACACTTACCA GATTGAAGGG 5340

CAGCCTTACT TTGGCTATTT CCGTGGAGCT TATTCAGCAG AGGAGTTGCA GGAAATCGAA 5400

GCCTATGCCC AACAGTTTGA CGTGACCTTT GTACCATGCA TCCAGACCTT GGCCCACTTG 5460

TCGGCCTTTG TCAAATGGGG TGTCAAGGAA GTGCAGGAGC TCCGTGATGT AGAGGACATT 5520

CTTCTCATTG GCGAAGAAAA GGTTTATGAC TTGATTGATG GCATGTTTGC CACGTTGTCT 5580

AAACTGAAGA CTCGCAAGGT CAATATCGGG ATGGACGAAG CCCACTTGGT TGGTTTGGGA 5640

CGCTACCTGA TTCTGAACGG TGTTGTGGAT CGTAGTCTCC TCATGTGCCA ACACTTGGAG 5700

CGCGTGCTGG ATATTGCTGA CAAATATGGT TTCCACTGCC AGATGTGGAG TGATATGTTC 5760 

TTCAAACTCA TGTCAGCGGA TGGCCAGTAC GACCGTGATG TGGAAATTCC AGAGGAAACT 5820 

CGTGTCTACC TAGACCGTCT CAAAGACCGT GTGACTCTGG TTTACTGGGA TTATTATCAG 5880 

GATAGCGAGG AAAAATACAA CCGTAATTTC CGCAATCATC ACAAGATTAG CCATGACCTT 5940 

GCATTTGCAG GGGGAGCTTG GAAGTGGATT GGCTTTACAC CTCACAACCA TTTTAGCCGT 6000 

CTAGTGGCTA TCGAGGCTAA TAAAGCCTGC CGTGCCAATC AGATTAAAGA AGTCATCGTA 6060 

ACGGGTTGGG GAGACAATGG TGGTGAAACT GCCCAGTTCT CTATCCTACC AAGCTTGCAA 6120 

ATCTGGGCAG AACTCAGCTA TCGCAATGAC CTAGATGGTT TGTCTGCGCA CTTCAAGACC 6180 

AATACTGGTC TAACGGTTGA GGATTTTATG CAGATTGACC TTGCCAACCT CTTACCAGAC 6240 

CTACCAGGCA ATCTCAGCGG TATCAATCCC AACCGCTATG TTTTTTATCA GGATATTCTT 6300 

TGTCCGATTC TTGATCAACA CATGACACCT GAACAGGACA AACCGCACTT CGCTCAGGCT 6360 

GCTGAGACGC TTGCTAACAT TAAAGAAAAA GCTGGAAACT ATGCCTATCT CTTTGAAACT 6420 

CAGGCCCAGT TGAATGCTAT TTTAAGTAGC AAAGTAGATG TGGGACGACG CATTCGTCAG 6480 

GCCTACCAAG CGGATGATAA AGAAAGTTTA CAACAAATCG CCAGACAAGA ATTACCAGAA 6540 

CTTAGAAGCC AAATTGAAGA CTTCCATGCC CTCTTTAGCC ACCAATGGCT GAAAGAAAAC 6600 

AAGGTCTTTG GTTTGGATAC AGTTGACATC CGTATGGGCG GACTCTTGCA ACGCATCAAA 6660 

CGAGCAGAAA GCCGTATCGA GGTTTATCTG GCTGGTCAGC TTGACCGCAT CGACGAGCTG 6720

GAAGTTGAAA TCCTACCATT TACTGACTTC TACGCAGACA AGGATTTCGC AGCAACTACA 6780 

GCCAACCAGT GGCATACCAT TGCGACAGCG TCGACGATTT ATACGACTTA ATATTCTTCG 6840 

AAAATCTCTT CAAACCACGT CAGCTTCCAT CTGCAACCTC AAAACAGTGT TTTGAGCAAC 6900 

CTGCAGCTAG CTTCCTAGTT TGCTCTTTGA TTTTCATTGA GTATAAAAAC AAGAACACCT 6960 

TGCTTGGCGC AGGGTGTTTC GCGTGAAACA GAAGAATTAT CTGGTTTCAA ATGCTACAGT 7020 

TAGACAAACT TATGATAAAA TAGCAGAAAG TGAATGTTTC CTAAGAGCAA TTGGAGGTAT 7080 

TATGCTACAC TTAAAATTAG TAAAACAAGA AATAGAAGCT GAAAAGCCAG CATCTGTAGA 7140 

AGCTTGGATC ATTTCCGTCA AATTTAAAAA AGGTTGCTAC CGACATATAT AGATTCCAAA 7200 

AACAAAAACG TTAGCGGAAC TAGCAGATGT GATTTTATGG AGTTTTGATT TTGCAAATGA 7260 

TCATGCTCAC GCATTTTTCA TGGATAATGT TGAGTGGAGT CATGCAGATT CTTACTTTCG 7320 

TAGCTTTGTT AGTGACGATG TTGAAGAACG TTACACAGAA AATGTCTATC TGGATAGCCT 7380




AAGTGTCAAA CAAAAATTTA AGTTTATTTT CGACTTCGGT GATGAATGGC GTTTTGAATG 7440 

CCAAGTGCTG AGAGAAATCG AGACAGAGGA CGAAGAAGCT TATCTCGTAC GTTCGGTTGG 7500 

AACGTCGCCA GAACAATATC CAGATTATGA TGGTTTTGAC TATGAAGAAT GGTAAAATTG 7560 

AAATCAGTCT GTGTAGGCTT AGTATTTCAA TAGACTTCCT GCAAAACTAG AATCCTAGTT 7620 

CATGATTGAT AATACCAGCA ATCAAATTCA TTCGTAATCC GAAGCGTTTA CGATGATTTC 7680 

GATAGGTTGT TGAAAACATT TTAAACGTTT TTACTTTGGC AAAGATGTTC TCAACCTTGC 7740 

TTCTCTCCTT AGATAGCGCA TGGTTATAGG CTTTATCTTC AGCTGTTAGT GGCTTGAGTT 7800 

TGCTGGATTT ACGTGAAGTT TGTGCTTGAG GACATATCTT CATGAGCCCT TGATAACCAC 7860 

TGTCAGCCAA GATTTTACCA GCTTGTCCGA TATTTCTGCA ACTCATTTTG AACAACTTCA 7920 

TATCATGACA ATAGTTCACA GTGATATCCA AAGAAACAAT TCTCCCTTGA CTTGTGACAA 7980 

TCGCTTGAGC CTTCATAGCG TGAAATTTCT TTTTACCAGA ATCATTCGCT AATTCTTTTT 8040 

TTAGGGCGAT TGATTTTTAC TTCCGTCGCA TCAATCATTA CCGTGTCCTC AGAACTAAGA 8100 

GGAGTTCTTG AAATCGTAAC ACCACTTTGA ACAAGAGTTA CTTCAACCCA TTGGCTCCGA 8160 

CGGATTAAGT TGCTTTCGTG AATACCAAAA TCAGCCGCAA TTTCTTCATA AGTGCGGTAT 8220 

TCTAGGCTTA ATTTAGGTTT TCGTCCACCT TTTGCGTGTT TAAGTTGATA AGCTGTTTTT 8280 

AATACAGCTA ACATCTCTTT AAAAGTCGTG CGCTGAACAC CAACAAGACG CTTAAATCGT 8340 

GTATCAGTTA ATTGTTTACT TGCTTCATAA TTTCGCAGGG AGTCTATTGA CTCTTTGGTA 8400 

GGTGTCAATG TTTTTTTCAT CTATCCCGAG AATTATTTTC CCGCCATTTG TATTTGCAAA 8460 

TGCTGAGTAG GTTTCCCAGA AAGACTCTGG AAGATTGTTT TTAGCTTTTT TGTATTCTAA 8520 

ATCAACCCCT TCAAATTTTA AGTCCATATT TTTCCTTTAC ATCTGTTTTT TGTGGTTCTG 8580 

GTATTTGTTC AAGTTGAGTG ATAATATAGC GAATTGAATT TCGAGAGTTT TTACTCAGTT 8640 

AATTTCTTTT TTAACCC 8657

(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11384 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

TCTATTTTGG GTATAGACTT ACCTATAAAG AAAAATATCT ATACACTGCC TTACTAGCTA 60 

TACTGAACGA GTCAACAAAA ACGATATATA TTGATGATAT AAATACAGCA AGATTTTTTA 120

ACTTCTTTGG CAATGATATT CCTAATTCGT CTTTAAAAAA AATTGACTAT ATCGCACCTT 180

CAGAAATTGT TTCATTTAGT ACGTACGTTC GACAACGTTC TAAAGTAATT CCTAAAATTT 240 

TGGAACATAT ATTAAAATCA AGTTTTTTAT TAGAGAATAT AGATGTTTCT GGTTACACTG 300 

TAAATATTTT AGAAGATCAA TTAACAAAAC ATAGAACAAT CAAAATTAGT AAAAACTAAC 360 

TGGTTGATCT CATGTATAAA TACCTAACAA AACCACGCGC CTTGCCTGCT GATGGAAAGA 420 

AAGGTACAAA TACATGAATA TCAAAGAAAA AATCAAAAAG AATGGCCAAA GAGTTTATTA 480 

TGCTAGTGTT TATCTAGGCG TTGACCAACT AACGGGCAAA AAAGCCCGTA CAACTGTTAC 540 

AGCAACCACT AAAAAGGGCG TTAAAGTAAA AGCGCGTGAT GCGATCAATA CTTTTGCTGC 600 

TAATGGCTAT ACAGTTAAAG ACAAGCCGAC AATTACAACA TATAATGAGC TTGTAAAAGT 660 

TTGGTGGGAT AGTTACAAGA ATACAGTTAA GCCAAATACT CGCCAATCCA TGGAGGGATT 720 

GGTTAGAGTG CATTTATTGC CTGTATTTGG GGATTACAAG CTATCTAAAC TTACTACGCC 780 

TATTCTTCAA CAGCAAGTAA ACAAATGGGC TGACAAGGCA AATAAAGGCG AAAAAGGGGG 840 

ATTTGCTAAC TACTCTTTGC TCCATAACAT GAATAAGGGT ATTTTGAAAT ATGGCGTAGC 900 

TATCCAGGTA ATACAATACA ACCCAGCTAA TGATGTCATC GTTCCACGCA AACAGCAAAA 960 

AGAAAAGGCT GCTGTCAAAT ACTTAGACAA CAAAGAATTA AAACAGTTTC TTGATTATTT 1020 

AGATGCTCTG GATCAATCAA ATTATGAGAA CTTATTTGAT GTTGTTCTGT ATAAGACTTT 1080 

ATTGGCCACT GGTTGCCGTA TTAGTGAGGC TCTGGCTCTT GAATGGTCTG ATATTGACCT 1140 

AGAAAGCGGT GTTATCAGCA TCAATAAGAC ACTAAACCGC TATCAGGAAA TAAACTCACC 1200 

TAAATCAAGC GCTGGTTATC GTGATATACC AATAGACAAA GCCACATTAC TTTTACTGAA 1260 

ACAATACAAA AACCGTCAAC AAATTCAGTC TTGGAAATTA GGCGGATCTG AAACAGTTGT 1320 

ATTCTCTGTA TTTACGGAGA AATATGCTTA TGCTTGTAAC TTACGCAAAC GCCTAAATAA 1380 

GCATTTTGAT GCTGCTGGAG TAACTAACGT ATCATTTCAT GGTTTCCGCC ATACACATAC 1440 

TACTATGATG GTCTATGCTC AGGTTAGCCC GAAAGATGTT CAGTATAGAT TAGGCCACTC 1500 

TAATTTAATG ATCACTGAAA ATACTTACTG GCATACTAAC CAAGAGAATG CAAAAAAAGC 1560 

CGTCTCAAAT TATGAAACAG CTATCAACAA TTTATAAAAA ATAAGGGTGA CCCATTTCCG 1620 

GGCTACCCTC TTACTATACC AAAAATTAGT AGGGGTAGTA AAAAGGGTAT TAAATTATAA 1680 

AAAGCACTAA GGGAAAGCGC CCCAAAGTGC TTATTTCAAA GGCTTTATAG CCTATAATCA 1740 

CATAAAGAGA TTATTTTTTA AGGTTGTAGA ATGATTTCAA TCGACGATAT TCAGCTACTT 1800 

CACCAAGTTG GTCTTCGATA CGAAGCAATT GGTTGTATTT AGCGATGCGG TCTGTACGTG 1860

AAAGTGAACC AGTCTTGATT TGTCCTGCGT TAGTTGCAAC TGCAATATCA GCGATTGTTG 1920

AATCTTCAGT TTCACCTGAA CGGTGTGATA CAACAGCAGT GTAACCAGCT TCTTTAGCCA 1980 

TTTCGATAGC TTCAAAAGTT TCAGTAAGAG TACCGATTTG GTTAACTTTG ATAAGGATTG 2040 

AGTTAGCAGC ACCTTCTTGG ATACCACGTG CAAGGTAGTC AGTGTTTGTT ACGAAGAAGT 2100 

CGTCACCAAC AAGTTGTACT TTCTTACCAA GACGTTCAGT AAGAGCTTTC CAACCATCCC 2160 

AGTCGTTTTC ATCCATACCA TCTTCAATAG TGATGATTGG GTATTTGTTA ACCAATTCTT 2220 

CAAGGTAGTC GATTTGTTCT GCAGATGTAC GAACAGCAGC ACCTTCACCT TCAAATTTAG 2280 

TGTAGTCGTA AACTTTACGT TCTTTATCGT AGAATTCTGA TGAAGCACAG TCAAATCCGA 2340 

TAAATACGTC TTTACCTGGT ACATATCCAG CAGCTTCAAT CGCAGCAAGG ATAGTTTCAA 2400 

CACCATCTTC AGTTCCTTCG AAACGAGGAG CGAATCCACC TTCGTCACCT ACGGCAGTTT 2460

CCAAACCACG TGATTTAAGG ATTTTCTTAA GAGCGTGGAA GATTTCAGCA CCGTAACGAA 2520 

GGGCTTCTTT AAATGTTGGC GCACCAACTG GCAAGATCAT GAACTCTTGG AAAGCGATTG 2580 

GAGCGTCAGA GTGAGAACCA CCGTTGATGA TGTTCATCAT TGGAGTTGGA AGAACTTTAG 2640 

TGTTGAATCC ACCAAGATAG CTGTAAAGTG GGATTTCAAG GTAGTCAGCA GCAGCACGAG 2700 

CTACAGCGAT AGACACACCG AGGATTGCAT TCGCACCCAA TTTACCTTTG TTAGGAGTAC 2760 

CGTCAAGTGC GATCATAGCA CGGTCAATAG CTTGTTGATC ACGTACATCG TAGCCAATGA 2820 

TAGCTTCAGC AATGATGTTG TTTACGTTGT CAACAGCTTT TTGTGTACCA AGACCACCGT 2880 

AACGAGATTT GTCACCGTCG CGAAGTTCAA CTGCTTCGTG TTCACCAGTA GAAGCTCCTG 2940 

ATGGAACCAT ACCACGTCCG AAAGCACCTG ATTCAGTGTA AACTTCTACT TCAAGTGTTG 3000 

GGTTACCGCG TGAGTCTAGG ACTTCGCGAG CGTAAACATC AGTAATAATT GACATTTTTT 3060 

ACTCTCCTTA TGAGTTAAAT TTTTTACACC TCTATAATAC CTTAAAACCC CTCCTTTTTC 3120 

AAGAAAAAAC GTTATCTTTG TGCAACTTTT CCTTAACTTT ATAAAGTAAT CGCTTTCTTT 3180 

TGTCTGTTTT ATTCTAACTT TTATGATATA CTGTTTTCAT GACAGATTTA TCAAAACAAT 3240 

TACTTGAAAA AGCTCATGGT GGGTTAAAAA TAAATCCGGA TGAGCAAAGA CGCTATCTTG 3300 

GTACTTTTGA GGAAAGAGTT CTTGGATATG TAGATATTGA CACAGCAAAT AGCCCTCAGT 3360 

TAGAAAAAGG CTTTTTATTT ATTTTAGAAA ACCTTCAGGA AAAAGCAGAG CCACTATTTG 3420 

TGAAGATTTC ACCAACTATC GAATTTGATA AGCAAGTTTT CTACTTAAAA GAAGCAAAAG 3480 

AAACTGATAG TCAAGCCACC ATAGTATCTG AAGAGCATAT TACTTCTCCT TTTGGCCTGG 3540 

TTATTCATAG CAATGCACCA GTTCAAGTAG AAGAAAAAGA CCTTCGACTT GCTTTTCCAA 3600 

AACTTTGGGA AGTTAAAAAG GAAGAACCAG CCAAAACATC CTTATGGAAG AAATGGTTTA 3660

GCTAAATCTT GCACATATTT AATAAGTGCC CAATATTGGC AGCCGTGCGC TCCAGATAGA 3720

AACTGGCATT TTTCAAACTA TCTTCTAAAG GTTCACTTTT CTCCAAAATA GAAAAGACAG 3780

CTTGGATATT TTCAAATGGT AGGGGAGGTA AATCTTCAGC AAGACTACCG CAAATAGCAA 3840

TAACAGGAAC TCCAACAGGG GTTCTTTTTG CAACACCTAT AGGCGCTTTC CCAGCAAAGC 3900

TTTGACTATC AAGTCTTCCT TCTCCAACAA CAACCAAGTC AGCATCTGAA ACTTTCTTAT 3960

CAAAGTTGAT TAAGTCCAAG CAGGTATCAA TTCCAGACAC GATACTTGCC TGAGCAAAGG 4020

CACACAAACC ACCAGCAAGG CCTCCACCTG CTCCTGCTCC TTTAATTTCT AATGTTGCAG 4080

GTGAGAATTT TTCATAAAAA TCTTGGATCG CCTGATCTAC GACTGCAAAC ATAGTCGGAT 4140

GTAGACCTTT TTGATTGCCA AAAGTGTAAG TCGCACCTTG ATGACCACAT AAGGGACTCA 4200

CGACATCTGC TAAAATATGA ATTTGAACAC CTTCAGGAAT TTTATAGCAA TTTTCTGTTG 4260

AAACAGAAGC TAAGTTTAAT AAGGATTGAC CGGAAGCAGG CAAGACATTT CCATCCCTAT 4320

CATAAAATTG ATAACCTAAA CCAGCAGCAA TCCCCAGTCC TCCATCATTA CTGGCCGTGC 4380

CACCAACACC GATATAAATA TCTTTAATCC CTTTAGAGAT GAGATGAAGA ATCAACTCTC 4440

CAATACCACA AGTTTGGATT TGAAGTGGAT TTCGTTTCTC TAGCGGAATT TTTCCAAGAC 4500

CAACCAAGTC AGCTACTTCA AATAGTGCCA GTTCCCCTTT TTGAAAATAG CGCATGGCTT 4560

CTTTTTGTCC AAAAGGGTCT GTCACTTGGA TCCATTTTTC TTTTAGGTCA AGAGAATGTC 4620

GGATAGCATC TACAGTACCT TCTCCCCCAT CACCAACAGG GCAGAGGAGA CATTCTACAT 4680

CTGCTATCGA TTGTTGGAAG CCTCTTTTTA TTGCTTCAGC TACCTGTTGA GCTGTCAAGG 4740

TTTCCTTAAA CGAATCCGGT GCAATTACAA TCTTCATATT TTCCCTCATT CTAAACAGTC 4800

AATCAAAGGG AGAACTTCTA AAAAATCCCT CTTGTCAACA TGATGTGGTA TTTCTTTTTT 4860

GAGCACTTCT TTGGCACAAA AGGCGATTCC TAACTTCGCC GACTTCAACA TTAATAGATT 4920

ATTAACCCGA TCACCGATTG CCACCGTTCT TTCTTTAGAA AGTTTTAGTT TCTTTCTCCA 4980

TTTTTCCAGA GTCTCTTTTT TGACCTGGGG ACTTATAATT TGTCCAACTA ATTTTCCTGT 5040

TAAAAGACCT TCTTTGACTT CAAGCTAGTT GGCAGTGAAA TAGGCAATAC CAAGGGATTT 5100

TGCTAATCTC TCCAACTATT GGTGTAAATC CACCAGACAC CAGACCAACT AGGATGCCAT 5160

TCTTTTGGAG AATAGAGATG AACTCTGGGA CATTTAGCGA TAGATGAATT GAGTTGAAGA 5220

CGTTATCAAA GACCAAAATA GGAAGACCTT CCAACAAGGA CACTCTTTTT CTTAAACTGC 5280

TTTCAAAGAC CAACTCTCCT CGCATTGCTC GACTTGTAAT CTGCGAAATT TCCGCCTCAT 5340

GACCTGCCTC TCTCCCTAAA AGATCAATCA CTTCTTCTAG GATTAAGGTT CCATCTAGAT 5400

CCAAAACACA CAAGCCTTTT ACTTGAGACA TCAGTTCTCC TCTCTAAACA GCCTAAAAAT 5460

CGTATGAAGT CATCATACGA TTTTATCTAT TAATTAACTA AACTATGGTA CAAGTCAAGG 5520 

TATGACTTGC AGGCTGTATC CCATGAGAAG TCACTCTCCA TAGCTTGTTT TTGTAGGTTT 5580 

CTCCAAATGT CTGGATGGTT TCTATACAAG TCCAATGCTG TTTGGAAAGT CCAATTTAAC 5640 

CAATAAGGAG ATAGATTGTC AAAGCTAAAG CCAGTACCGC TTCCTTCGAT TGGATTGAAA 5700 

GCGCGAACTG TATCTCGCAA GCCTCCAACT TCATGGACCA ATGGCAAGGT TCCATAACGC 5760 

ATAGCCATCA TTTGAGACAA GCCACACGGT TCAAAACGAC TTGGCATGAG GAAGAGGTCA 5820 

CAAGCAGCGT AGATTTCCTG AGCAAGTTTG ACATCAAAAG TGATATTTGT TGATAGCTTG 5880 

TCTGGGTAAA TCTGAGCAAA CCATGAGAAA GCTCCTTCAA AGGCTGGATC GCCAGTTCCC 5940 

AAAAGAACAA TCTGAACATC TTCTTGCAAG ATATGGTGAA GACTTTCGAC CACCACATCA 6000 

AAACCTTTTT GACGTGTCAA ACGAGAAACA ATTCCCACCA GTGGAACGTC TGCTCTAACA 6060 

GGCAAGCCAA CTCTTTCTTG CAATTTTGCC TTATTTTTGG CTTTCCCAGA CAAATCTTCC 6120 

TGATTGAAAT GATAGTCTAA AAGAGCATCC GTCTGAGGAT TATAAAGATC AGCATCAATC 6180

CCATTCACGA TACCAGATAC TTTACCAGAC TCCATTTTAA GAATCTGATC CAAATTACAT 6240 

CCAAACTGAC TAGTCATAAT TTCATGAGCA TAGCTAGGTG AAACGGTTGA AACACGGTTC 6300 

GCATAGAGAA TACCTGCCTT CATCCAGTTC AGACAGTTGT TCCATCGAAG GGTGCCATCA 6360 

GCGTAACGTT CAAAGCCAAC TCCAAACAAA TCACCCAACA TTCCTTCTGA AAATTGTCCT 6420 

TGGAATTCTA AATTATGAAT GGTTAAAACT GTTTCAATGT CCTCATAGGC TTGAATCCAA 6480 

CGGTATTTTT CCTTCAACAA GAAAGGAATC ATAGCTGTAT GGTAGTCATG AACATGGAGA 6540 

AGATCAGGAA TAAAGTCAAT CCTTTCCATA GCCTCAATGG CAGCCAGTTG GAAAAAGGCA 6600 

AAGCGTTCTC CGTCATCAAA ATCACCGTAA ACATGACCAC GGAAGAAATA ATATTGATTG 6660 

TCAATAAAGT AGAAGGTTAC ACCATTTAAT ACTGTTTTCT TAATTCCACA ATACTGTCTG 6720 

CGCCAACCAA CGCTCACCTC AAAATGAAGC ACATCTTCAA TCTGATTTCC AAATTTAGCC 6780 

TCTACCATAT CATAGTAGGG TAAAATCACT GCAACTTCGT GCCCAGCTTT TACCAGTGAT 6840 

TTTGGAAGAG CGCCAATGAC GTCTCCCAAA CCACCTGTTT TTGAAAAGGG TGCACCCTCT 6900 

GCTGCTACAA ATAAAATTTT CATGAATGAA TATCCTCTGT TACTTTAGCA CCTTTCTTAA 6960 

CCACAACTGG ATGTTCTGCA GTTCCTCGAA TCACAACACC ATGCTCAACT TCAACCCCTT 7020 

TGTCCAAGAT AGCATATTCG ACCTGAGCCC CTTCTCCAAT AACAACACGA GGGAAGAGCA 7080 

GGCTATCTTT AACCAAGCTA TCCTTATGGA CATGAATATT ACGTGATAGA ACAGAATTAG 7140 

CTACTTGACC TTCAATAATA CTACCAGAGG CAAACTGAGA AGTGCTTACC TTAGATGTAT 7200 
TAGCATAGTA AGTTGGCTCT TCGTTTTTGA CCTTTGTATA AATCTTTTGG TTTGGTGAGA 7260

AAAGAGAATA GAATTTTTGT GATTCAAGCA TATCGATATT CGCTTGATAA TAAGATTTAA 7320 

CAGAGTGAAT ATTGGCTAGA TAGCCCGTGT ACTCGTAGGC GAAAGCTCCC TCTTTTACAG 7380 

CCAAATCCCG TAAAACATAG CGCAATTTCT CTGGATGTTC TTTTTTAGCT TCTTCTTCCA 7440 

AGTGTTCAAT CAACCAAGGT GTATCAACGA CAAAGATATC TGTAGACATA TTGAACGTTT 7500 

CAGCTGTTGA CTTGCTATCA AAGAGTTTAT GAGAAAGAAC ATGGTCTGTT TCATCTACAT 7560 

CCAAGATTGC ATTTACTTCT GAAATATCTT TCTTAGCTAG TTTTTTATAA ACTACAGTGA 7620 

TAGGCTCTTT TGTTGTACTA TGTAGGTGGA AAACTTGGTT CAAATCAATG TTAATAAGAA 7680 

CATCGCAGTT GAGGGCAACC GTTTGGTTTG AGCCAGAACG TTTCAAATAA GTAAGAAGCT 7740 

GTTGGTAGTA TTCTTTTCCA ACTGTACTAC TTTCTACACG GGTATTGTAA ATTCCTAGAT 7800 

AGTAATGGCT AAGAAGGGTT GATAAGCCCC ACTCGCGTCC TGAACGAATA TGGTCAAATA 7860 

CTGAGCTGAT ATTATCCTGC TGGAAAATAC CAAAGACACT ACGAACACCT GCATTAGCAA 7920 

GGCTTGAAAG TGGGAAGTCA ATCAAACGAT ATTTCCCACC AAATGGCAAA CTTGCTACTG 7980 

GACGGTGGTC CGTCAATGTC GACATATTGT GAAAACCAAC TGTATTTCCT AAAATGGCAG 8040 

AATATTTATC AATCTTCATC TGTTGCTACC CCCACTACTT CATTATATCC TACAACTTGT 8100 

ACTTCATCTG TTCCATCAAT TTCGACACCG TCAGAAATAA TCGCACCTTC ACCAATAATG 8160

GCACGTTTAA TCTTAGCTCC TTGACCAATG ATAGCTCCAC TCATGATAAC TGAATCAAGG 8220 

ACTTCCGCTC CTTCGCGAAC TTGCGCGCCT GTTGAAAGGA TAGAATGTTT AACAGTTCCA 8280 

TCAACGAAAC ATCCGTCTAC AACTAATGAG TCTTCCACAT GAGCATTTGC CCCGAGGAAG 8340 

TTTGGTGGTG AAATCAAGTT TCTTGAGTAA ATCTTCCATT GACGGTTACG ACTATCCAAG 8400 

GCATTTTCTG GAGAAATATA CTCCATGTTC GCTTCCCAAA GTGACTCAAT AGTACCAACA 8460 

TCTTTCCAAT AACCACTAAA TTCGTAAGCA TAAACACTTT CACCTGACTC AAGGTAATTT 8520 

GGAATGACAT TTTTACCAAA GTCTGACATG CCAACCTTGC TCTTTTCAGC AGCGACTAAC 8580 

ATATTACGAA GGCGTTGCCA ATCAAAAATG TAGATTCCCA TAGAAGCTTT TGTAGATTTA 8640 

GGTTGAGCTG GTTTTTCTTC AAATTCAACA ATACGATTGT TAGCATCTGT GTTCATGATA 8700 

CCAAAACGGC TTGCTTCTTT AAGAGGGACG TCTAAAACTG CTACTGTCAA GCTGGCATTA 8760 

TTATCCTTAT GAGACTGGAG CATATCATCA TAGTCCATTT TGTAGATGTG ATCCCCAGAC 8820 

AAAATCAAGA CATACTCAGG ATTGACACTG TCGATATAGT CGATATTTTG GTAAATAGCG 8880 

TGACTAGTCC CCTCAAACCA ACGATTTCCT TCACTTGCAG AATAAGGTTG AAGAATAGAG 8940 
ACACCTGAAT TAATACCGTC TAGTCCCCAG CTTGAACCAT TCCCAATATG GTTGTTGAGA 9000

GCAAGTGGTT GATACTGTGT AACGACCCCA ACATTGTGAA TCCCTGAGTT GGCACAGTTT 9060

GATAGGGCAA AGTCAATGAT ACGGTAGCGC CCACCAAATT GCACAGCTGG TTTTGCGATG 9120

CTTTGAGTGA GTTTACCGAG ACGAGTTCCT TGCCCACCAG CAAGAATCAA AGCTAACATT 9180

TCATTTTTCA TTTTCTACTC CTTTTTGGTT TTTATTTGTG ACGGTTTTAG TAGATTTCAA 9240

GCGACGTTTG ATTTTCCATA CACTTGCTCC CATAGCCGGT AGGGTAAAGG TTAAGGTCTG 9300

CTCATAATCT TTCCATAGTC CTTCTTGCGT TTGAACAGTT TGATTATGTT CTTTCCAAAC 9360

GCCTCCCCAC TCTTCCAACT CAGTATTCCA TACTTCTTCG TAAATTCCTG CAACGGGTAG 9420

TCCGATTGTA AAATCTTTCC GCTCAACAGG TACCATATTA AAGATACAGA CTAACATTTC 9480

TCCCTTTTTA CCCTTACGAA TAAAGGAAAG AACACTCTGG TCTCGATTAT CCGCATCAAT 9540

GATTTCAATA CCATCATAGC TGGTATCAAT TTCCCACAGA CAGCGATGAT CTTTGTAAAA 9600

CTGGTTTAGC TGAGAAGCGA AATACTTCAT CTTAGCATTC ATTGGGTCTT CTAGGTTAGA 9660

CCATTCCAAC TGTTCTTCAG ATTTCCATTC TAGGAATTGA CCGTATTCGC TACCCATGAA 9720

GAGCAATTTC TTACCAGGGT GACAAATTTG GTACGTATAG AGATTGCGCA AGCCTGCGAA 9780

TTGATTGTAA CGATCTCCCC ACATCTTATG CATCATACTC TTCTTGCCAT GAACCACTTC 9840

ATCGTGCGAG AATGGCAAGA GATAATTCTC CTTGAAAACA TACATAAAGC TGAAAGTCAC 9900

CAGGTTAAAG TCATATTTAC GATAGATCGG ATCTTCTTCG TAGAAACGGA GGATATCATT 9960

CATCCAGCCC ATGTTCCATT TGTAGTCAAA TCCTAGACCA CCAATCTCTT TCATTCCCGT 10020

AATCTTGATC GCAGACGAAC TTTCTTCTGC AATCATCATC ACATCTGGAT ATTCTAACTT 10080

AATAACCTCA TTCAAGCGCT GAAGGAAATA ATAACCTTCA TAGTTGAGAT TTCCGCCATC 10140

TTTATTAGGT GTCCATGGAG CATCATCATA GTCCAAATAG AGCATGTTGC TAACAGCATC 10200

CACACGAATA CCATCCAAAT GATAGACATC AATCCAATGC TTAATGCAAG AAATTAAGAA 10260

GGACTGGACT TCATTTTTTC CAAGGTCAAA ATTAAGGGCA CCCCAACCAT GGTTATGAGC 10320

CTTATTATGG TCTTGGTATT CAAAAGTCGG TGTCCCATCA TAATAGGCTA AGGCATCATC 10380

GTTGATGGTA AAGTGACTGG TACCCAGTCC ACAATAACCC CAATATTATG GGTATGACAC 10440

TCCTCGACAA AATCTTGAAA CTCCTCTGGT CGGCCATAAG CATGCTCTAA AGCGAAGTAA 10500

CCCATAAGCT GATACCCCCA ACTCAAGCCC AAAGGATGGG ACATCAAGGG CATAAACTCA 10560

ATATGAGTAT AGTTCATTTC AACGAGATAA GGAATGAGTT CATCCTTGAG CTGGGCAAAA 10620

CTATAAGGAC TGCCATCAGA ATTTCTTTTC CATGATCCAG CGTGAACTTC ATAAATATTG 10680

ACAGGACGCT CTTCAAAGCC CCAACGTTTT CTTCGTGCCA GCCAAAGTCC ATCCTTCCAT 10740

TTCTTCTCAG GAAGCTCTGT TACGATTGCC CCTGTTCCTG GACGAGCCTC ATACCTGACA 10800

GCAAAAGGGT CAATCTTCAT CAGTTGATGA CCATTTTGAC GTGTGACATG ATATTTGTAA 10860 

ATATGCCCTT CTTGAGCCAT ATTGGTAAAG ACTTCCCAGA CCCCAAAATC ATTTCTTACC 10920 

ATTGGAATCT GATTTTCAAT CCAGTTGGTA AAATCACCAA CCAAGTGAAC AGCCTGAGCA 10980 

TTAGGTGCCC AAACACGGAA GGTATAGCCA TGCTCTCCAT TTAGTTCTTC CCTATGTGCT 11040 

CCTAGATAAT GTTGGAGATA AAAATTTTCA CCCGTCATAA AGGTTTTTAA TGCTTCTCTA 11100 

TTATCCATAT ACTCCCCTTC TCCTGTAAGC GTTTTCTATG TTTTTATTAT ACTACCTTTT 11160 

TAGAGAAGAT TCAAGTAAAT TACTATACTT CTTTAATTAT TTTGAAAATC TACAACAAGT 11220 

TCACTTACTC GTTCAATTGT AAATCAATAT TTTTTCAAAA AATTGCGAAA ACGCCTTTCT 11280 

TTTTCTACTA TAGTGAAATG AAATAAAACA TGCGCAAATC GATTAAGGAA TTTAATCTAA 11340 

TTTCTAACAA TGTCTTAGAA ATCAAAGTGT ACTATTTTAA CTCC 11384 



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7577 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

TGTTGATTTG TTACTAGACG TTGACCAACG TCCTTCGGCT GGAAAAGGAA TTCTCCTTAG 60 

TTTCCAACAC GTTTTCGCCA TGTTTGGTGC GACCATCTTG GTACCATTGA TTTTGGGAAT 120 

GCCTGTATCT GTTGCCCTTT TTGCTTCAGG TGTTGGAACA CTCATCTACA TGATTGCTAC 180 

TGGTTTTAAA GTTCCAGTTT ATCTAGGTTC TTCATTTGCC TTTATCACAG CTATGTCACT 240 

GGCTATGAAA GAAATGGGGG GGGATGTATC TGCTGCCCAA ACAGGGGTTA TCTTGACTGG 300 

TTTGGTCTAT GTCCTTGTTG CTACCAGCAT CCGATTTGTA GGAACAAAAT GGATTGATAA 360

ACTCTTGCCA CCAATCATTA TCGGTCCTAT GATCATCGTT ATCGGTCTTG GACTTGCAGG 420 

TTCAGCTGTT ACCAATGCAG GTCTTGTAGC AGACGGAAAT TGGAAAAATG CTCTGGTAGC 480 

CGTTGTTACT TTCCTAATTG CTGCCTTTAT CAATACAAAA GGAAAAGGCT TCCTACGAAT 540 

CATTCCATTC CTCTTTGCCA TTATCGGTGG TTACCTTTTC GCACTAACTC TTGGCTTGGT 600 

TGACTTTACA CCAGTTCTTA AAGCCAACTG GTTCGAAATT CCTGGTTTCT ACTTGCCATT 660 

TAGCACAGGT GGTGCCTTTA AAGAGTACAA TCTTTACTTT GGTCCAGAAG CCATCGCTAT 720 
CTTGCCAATC GCTATCGTAA CAATTTCTGA ACATATCGGA GACCATACTG TTTTGGGTCA 780

AATCTGTGGT CGTCAATTCT TAAAAGAACC AGGTCTTCAC CGTACTCTTC TTGGTGACGG 840 

TATCGCAACT TCTGTTTCTG CCTTCCTTGG TGGACCAGCC AATACAACTT ACGGAGAAAA 900 

TACAGGGGTT ATCGGTATGA CTCGTATCGC TTCTGTCTCA GTTATCCGTA ACGCTGCCTT 960 

CATCGCGATT GCCCTCAGCT TCCTTGGTAA ATTCACTGCC TTGATTTCAA CTATTCCAAA 1020 

CGCTGTACTT GGTGGTATGT CAATCCTTCT CTATGGGGTT ATCGCCAGCA ATGGTTTGAA 1080 

AGTCTTGATT AAAGAACGTG TTGATTTCGC TCAAATGCGA AACCTCATCA TCGCAAGTGC 1140 

TATGTTGGTT CTTGGACTTG GAGGAGCTAT CCTTAAACTT GGTCCAGTTA CACTTTCAGG 1200 

TACTGCCCTT TCAGCCATGA CAGGAATCAT CTTGAACTTG ATCTTGCCAT ACGAAAATAA 1260 

AGACTAAGAG TCTAAATACA CCTAATCCAC TCAGACAGCT GAGTGGATTT TTCGTATACC 1320

ATAATAAAAG TGTCTTAACA AAATTATTAA AATCAAAAAA CGTATAATAT CAGATATTCT 1380 

AAAACCTTGA TACTGTACGT TTTATCATAG AAATTTTTAC TTTATTTTCT CATCAAATGA 1440 

GATTTGCATC AATCTCTTGT CTTACTTGCG TTTCTTCTTC GCTTTCTTCA TTTTGTTAGC 1500 

CATACGTTTC ATGGACTGTT TCATGGCAAA TTCACCAATT TTACCTTTCA AACCGCCACC 1560 

AAACATCTGG CTCATATCTG GCATTCGTGC TCCTCCGAGA GCTGATAAGT CAGGCATACC 1620 

GCCTTGTCCC ATCATTCCTT CAAGGGCAGA CATATCCATT CCTCCCATAT TTGGCATATT 1680 

TTTAGGAAGG TTATTTGGAT TAATCCCCAT TTGCTTCATC ATTTTATTCA TATCCCCAGA 1740

CATAACACCC TGCATGAGCT GTTTAGCCTG GTTAAAGTCC TTGATGAATT TATTGACTTC 1800 

GACGAATGTA TTTCCAGAAC CAGCAGCAAT ACGACGGCGA CGGCTTGGAT TTAACAAATC 1860 

TGGGTTTTCA CGCTCTTCAG GTGTCATCGA AGACACAATG GCACGTTTAC GAGCAATCTG 1920 

GCGTTCATCC ACCTTCATGT TTTGAAGGGC TGGATTGTTG GCCATACCTG GAATCATCTT 1980 

GAGCAAGTCT TCCATCGGCC CCATATTTTG CACCTGATCT AATTGATCGA TGAAATCATT 2040 

AAAATCAAAG GTGTTTTCGC GCATCTTCTC AGCCATTTCA AGGGCTTTTT GTTCATCGTA 2100 

TTCCTGAGAA GCTTTCTCAA TCAAAGTGAG CATATCCCCC ATACCAAGGA TACGGCTAGA 2160 

CATGCGGTCT GGGTGGAAGG TTTCAATGTC CGTAATCTTT TCACCTGTAC CAGTGAACTT 2220 

GATTGGTTTT CCAGTAATGT GACGAACAGA CAGAGCAGCA CCACCACGAG TATCGCCATC 2280 

AATCTTGGTA AGGATGACCC CAGTCACTTC CAACTGAGCA TTAAACTCAC GCGCAAGATT 2340 

GGCTGCTTCC TGACCAATCA TAGCATCAAC GACAAGCAAG ATTTCATTTG GTTGAGCCAA 2400 

TGCTTTCACA TCACGAAGCT CATTCATGAG GAGCTCATCA ATCTGCAAAC GACCCGCAGT 2460 

ATCAATCAAG ACATAGTCGT TATGATTAGT TTGGGCTTGC TCCAAACCTT GACGTACAAT 2520 
CTCAACAGCT GGTACTTCTG TTCCAAGTGC AAAGACAGGC ACATCAATCT GTTGTCCCAA 2580

GGTCTTAAGC TGGTCAATGG CAGCTGGACG ATAAATATCC GCCGCAATCA TCAAAGGACG 2640

AGCATTTTCT TCTTTCTTGA GTTTGTTGGC CAATTTACCA GCAAAGGTTG TTTTACCAGC 2700

CCCTTGTAAA CCAACCATCA TGATGATGGT TGGAATCTTA GGTGACTTGA TAATTTCTGC 2760

CGTATCAGAA CCTAAAACGG CTGTCAATTC CTCATCAACG ATTTTAATAA TCTGTTGCGG 2820

AGGATTAAGT GTATCAATGA CCTCATGCCC GACTGCACGC TCACGAACTT TCTTGATAAA 2880

GTCCTTTACA ACAGGCAAGG CAACGTCGGC CTCGAGCAAG GCCAAGCGAA TTTCTTTGGT 2940

TGCCTCTTGG ACATCAGATT CAGAGATTTT TCCTTTTTTA CGTAGATTTT TAAAGACGTT 3000

CTGCAAACGT TCTGTTAAAC TTTCAAATGC CATTTTTCTT CCTCTTATTC TCTATTATCA 3060

ATGCTTGTTA AAATTTCTAT CTGCTCCTGC AGAAAGTCAT CCTTGGGATA GCGCTCCAAA 3120

ATCTGATCAA AAATCTGACT GCGGACAATA TAGTCCGAGT ACATGTGCAA TTTCATCTCA 3180

TAATCTTCCA GAATCTTTTC TGTTCGCTTG ATATTGTCAT AGACAGCCTG ACGACTGACA 3240

CCGAACTCCT CGGCAATTTC AGCAAGGCTG TAATCATCAG CGTAGTAGAG CTCGATATAA 3300

TTCATTTGCT TATCTGTCAA AAGCGCCGCA TAAAATTCAA AGAGCGCATT CATACGATTG 3360

GTTTTTTCGA TTTCCATAAC TTTTATTATA CCAAAAATTA GCCTAATCTA CCACACTAGG 3420

AAGCCGATCC AAGAAGATAG ATAGCTAAAT TTGAAAAAGA CATGAGCCTA GCCCCAAGTA 3480

ATTTCCAATT GATAGCTGGC AAAGGGATGT CCCTCTTGAT TTTGTAGTTG ATAATCTAGT 3540

TCAATCTTTT GCCTATCAAC TTGATAATGG CTCGTTTGGA TGATAAAGTC CTGCATGCCC 3600

ATAGGTGTAG GAATATAGGC TAAACTATCG CTATCCTTTA GAAAGCGCAT AATGGTCTTG 3660

GGATTAGAAA ATCGGCTCAT CACAAGTTCT TGACCATGAA ATTTAATCAC TACTTTTTCC 3720

TTTTCCTCAT TATAGAAAAG CAGGTAGCTA TAATCTCCTT TTTCATGCAC TTCCACATCA 3780

TAAAGCTGGT CAATCACTTC CAACTGCTCA TCAAACTGAA TCGTATTTCG CATCCGAATC 3840

TTCACATCAG GCCCTCTTTC TTGTCTCTTG TCCTACTATT TTACCAAAAA GAGCAGGATT 3900

TTGCTATAAT GGTCATATGA ACGAAAAAGT ATTCCGTGAC CCTGTTCACA ACTACATCCA 3960

TGTCAATAAT CAAATCATCT ATGACTTGAT TAATACAAAA GAATTTCAGC GTTTGCGCCG 4020

GATCAAACAA CTGGGAACTT CCAGTTATAC CTTCCACGGT GGAGAACACA GTCGCTTCTC 4080

TCACTGTCTA GGAGTCTATG AAATTGCACG ACGCATCACA GAGATTTTCG AAGAAAAATA 4140

TCCTGAGGAA TGGAATCCTG CCGAGTCTCT CTTGACCATG ACCGCTGCTC TCCTACACGA 4200

CCTTGGGCAT GGTGCCTACT CCCATACTTT TGAACATCTC TTTGATACAG ACCATGAAGG 4260

CATTACTCAG GAGATTATTC AAAATCCTGA GACAGAGATT CACCAAGTCC TGCTACAAGT 4320

GGCACCTGAT TTCCCAGAAA AGGTGGCCAG TGTCATTGAC CATACCTATC CTAATAAGCA 4380

GGTCGTGCAG CTCATTTCTA GTCAGATTGA CGCAGATCGC ATGGACTATC TCTTGCGCGA 4440

CTCCTATTTT ACAGGAGCAT CCTATGGGGA ATTTGACCTG ACTCGAATCC TCCGAGTCAT 4500

TCGTCCTATC GAAAATGGTA TCGCCTTTCA GCGCAATGGC ATGCACGCCA TCGAAGACTA 4560

CGTCCTCAGT CGCTACCAGA TGTACATGCA GGTTTATTTC CACCCCGCAA CACGCGCCAT 4620

GGAAGTTCTC CTACAGAATC TTCTCAAACG CGCCAAGGAA CTCTATCCTG AGGACAAGGA 4680

TTTCTTTGCC CGAACTTCTC CACACCTCCT GCCTTTCTTC GAAAAAAATG TGACCTTGAC 4740 

TGACTATCTG GCTCTGGATG ATGGCGTGAT GAATACCTAC TTCCAGCTTT GGATGACCAG 4800 

TCCTGACAAG ATTCTTGCAG ATTTATCGCA TCGCTTTGTC AACCGCAAGG TCTTTAAATC 4860 

CATTACCTTT TCACAAGAGG ACCAAGATCA ACTTACTAGC ATGAGAAAAT TGGTTGAGGA 4920

TATCGGCTTT GATCCCGACT ACTACACTGC CATTCATAAG AACTTTGACC TCCCTTATGA 4980 

TATCTATCGT CCCGAATCTG AAAACCCACG GACACAGATT GAGATTTTAC AAAAAAATGG 5040 

AGAACTGGCC GAACTCTCTA GCCTGTCTCC TATCGTCCAA TCCCTTGCTG GCAGTCGCCA 5100 

CGGAGATAAT CGCTTTTATT TTCCAAAAGA AATGTTGGAC CAAAACAGCA TCTTTGCAAG 5160 

CATTACCCAG CAATTTTTAC ACTTGATTGA GAACGATCAT TTTACCCCAA ATAAAAACTA 5220 

GAAGAGGAAA TTTATGAGTA TTAAACTAAT TGCCGTTGAT ATCGACGGAA CCCTTGTCAA 5280 

CAGCCAAAAG GAAATCACTC CTGAAGTTTT TTCTGCCATC CAAGATGCCA AAGAAGCTGG 5340 

TGTCAAAGTC GTGATTGCAA CTGGCCGCCC TATCGCAGGC GTTGCCAAAC TTCTAGACGA 5400 

CTTGCAGTTG AGAGACGAGG GGGACTATGT GGTAACCTTC AACGGTGCCC TTGTCCAAGA 5460 

AACTGCTACA GGACATGAGA TTATCAGCGA ATCCTTGACT TATGAGGATT ATCTAGATAT 5520 

GGAATTCCTC AGTCGCAAGC TCGGTGTCCA CATGCATGCC ATTACCAAGG ACGGTATCTA 5580 

TACTGCAAAT CGCAATATCG GAAAATACAC TGTACACGAA TCAACCCTCG TCAGCATGCC 5640 

TATCTTCTAC CGTACCCCTG AAGAAATGGC TGGCAAAGAA ATTGTTAAAT GTATGTTTAT 5700 

CGATGAACCA GAAATTCTCG ATGCTGCGAT TGAAAAAATT CCAGCAGAAT TTTACGAGCG 5760 

CTACTCCATC AACAAATCTG CTCCTTTCTA CCTCGAACTC CTTAAAAAGA ATGTAGACAA 5820 

GGGTTCAGCC ATTACTCACT TGGCTGAAAA ACTCGGATTG ACCAAAGATG AAACCATGGC 5880 

AATCGGTGAT GAAGAAAATG ACCGTGCCAT GCTGGAAGTC GTTGGAAACC CCGTTGTCAT 5940 

GGAAAATGGA AATCCAGAAA TCAAAAAAAT CGCCAAATAC ATCACCAAAA CAAATGACGA 6000 

ATCCGGCGTT GCCCATGCCA TCCGAACATG GGTACTGTAA AAGTATCATT TTTCAATAAG 6060 
AATTGATTAG CAATAAAATC CAATGAATTT TTTTAGCAAA CTATTTAATT TAAAACAAAA 6120

TAATCATAAT AGAGACACAA ATTCTGATTG TAACAATTTT TACCTAAACG AATTAGAATG 6180

TGGCCTTACT CCTGGGCAAC TCATACTCAT AGATTGGAGT CAAAAAACAG GGAGAAATTA 6240

TAATTTCCCA AGATATTTTA AATACTCTCT TCAAATTGAC CCTGAATCTA CACACAATGA 6300

ATTATACAAA TTAGGATACT TCACTAAAAA TAAGACTTTA TCATATCTTA CAGTAGTAGA 6360

ATTAAAAACT ATATTATCTA AACATAATTT AGCTACTTCT GGAAAAAAAG CAGAATTAAT 6420

TACAAGAATA ATTAATAATG TTAACATTGA CAATTTAGAT ATTCCGTTCG AATTTAAACT 6480

AACAAAAGAA GCACAAAATC TTATTATCGA ACATAGTGAC TATATCAAAG CATACTATGA 6540

TAAAGACATA ACTATGGAAG ATTATTGTAA AGAAAAAAAC AATATGTCTT TTAAAGCAAG 6600

TTTTGGTGAT ATAAAATGGA GTCTCTTAAA TAAACAAGCT CATAGGAATA CTGTATCAGG 6660

AGATTTTGGA TGCTTATCTA ACACACGAAA GGCTCAGGGA AGACATTTGG AACAAGAAGG 6720

TAATATTAAA CATGCTTTAA TATATTACAT AGAATCTTTG ATAATTACTA TTTCAGGATT 6780

AGAAAACAAT TTTTCAGCCA CTGATTATCC AGTATATTAT CCCGATTCGA TACCTGACTA 6840

CTCACTAAAA CATATTCAAA CATTAATGGA ATCATTATCT GATGACGATT ATGATTTTGC 6900

TTTTGATGAA GCATTATTTC GCTTCTCAAT TTTGAATGCA AATCATTTTT TATCTAAGGA 6960

AGATATTGAC TATTTAAGAG TTAATTTACC TCGTTCCACT GCTGAAGAAA TAAACAATTA 7020

CTTAAAGAAA TATGAATGTT ATAGTCCTTT AAATAATTTA GAACTTGACG ATTTTGAATA 7080

AATTGACTAT ACAAACATTT ATATACTCGA TATAGTCTCA ATTTTATCTG ATGATTGCCC 7140

AAATTTTTCA ATAATAAAAC GCATAATATT ATGGAGACAA TCCCCTATAT TATGCGTTCT 7200

TTTAATATCA AAGACTTTTT GACAAACTTC TTTGATATCT AATTACATGG CCCCTGCAGG 7260

AATCGAACCT GCAACTACTC CTTAGGAGGG AGTTGTTATA TCCATTGAAC TAAGGGAGCT 7320

AGATAAAAAC TCTGCTAAAT GAGCAGAGTT TTTTAGTCGA ATTAACGACG GATTTCTTTG 7380

ATACGAGCTG CTTTACCTTG AAGAGCACGC AAGTAGTACA ATTTCGCACG ACGTACTTTA 7440

CCGTAACGAA CAACTTCGAT TTTTTCAACA CGTGGAGTGT GGATTGGGAA GATACGCTCA 7500

ACACCTACAC CGTTAGAGAT TTTACGAACT GTGTAGTTTT CTGAGATTCC AGCACCTTTA 7560

CGTGCGATAA CAACACG 7577


(2) INFORMATION FOR SEQ ID NO: 47:









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4945 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

CCTCGCTGAT GATTGGTGCT GTTTTATTTG CTGGTCCAGC CTTGGCTGAA GAAACTGCAG 60 

TTCCTGAAAA TAGCGGAnCT AATACAGAGC TTGTTTCAGG AGAGAGTGAG CATTCGACCA 120 

ATGAAGCTGA TAAGCAGAAT GAAGGGGAAC ATGCTAGAGA AAACAAGCTA GAAAAGGCAG 180 

AAGGAGTAGC GATAGCATCT GAAACTGCTT CGCCAGCAAG CAATGAAGCT GCAACTACTG 240 

AAACTGCAGA AGCAGCTAGC GCAGCTAAAC CAGAGGAAAA AGCAAGTGAG GTGGTTGCAG 300 

AAACACCATC TGCAGAAGCA AAACCTAAGT CTGACAAGGA AACAGAAGCA AAGCCCGAAG 360 

CAACTAACCA AGGGGATGAG TCTAAACCAG CAGCAGAAGC TAATAAGACT GAAAAAGAAG 420 

TCCAGCCAGA TGTCCCTAAA AATACAGAAA AAACATTAAA ACCAAAGGAA ATCAAATTTA 480 

ATTCTTGGGA AGAATTGTTA AAATGGGAAC CAGGTGCTCG TGAAGATGAT GCTATTAACC 540 

GCGGATCTGT TGTCCTCGCT TCACGTCGGA CAGGTCATTT AGTCAATGAA AAAGCTAGCA 600 

AGGAAGCAAA AGTTCAAGCC TTATCAAACA CCAATTCTAA AGCAAAAGAC CATGCTTCTG 660 

TTGGTGGAGA AGAGTTCAAG GCCTATGCTT TTGACTATTG GCAATATCTA GATTCAATGG 720 

TCTTCTGGGA AGGTCTCGTA CCAACTCCTG ACGTTATTGA TGCAGGTCAC CGTAACGGGG 780 

TTCCTGTATA CGGTACACTC TTCTTCAACT GGTCTAATAG TATTGCAGAT CAAGAAAGAT 840 

TTGCTGAAGC TTTGAAGCAA GACGCAGATG GTAGCTTCCC AATTGCCCGT AAATTGGTAG 900 

ACATGGCCAA GTATTATGGC TATGATGGCT ATTTCATCAA CCAAGAAACA ACTGGAGATT 960 

TGGTTAAACC TCTTGGAGAA AAGATGCGCC AGTTTATGCT CTATAGCAAG GAATATGCTG 1020 

CTAAGGTAAA CCATCCAATC AAGTATTCTT GGTACGATGC CATGACCTAT AACTATGGAC 1080 

GTTATCATCA AGATGGTTTG GGAGAATACA ACTACCAATT CATGCAACCA GAAGGAGATA 1140 

AGGTTCCGGC AGATAACTTC TTTGCTAACT TTAACTGGGA TAAGGCTAAA AATGATTACA 1200 

CTATTGCAAC TGCCAACTGG ATTGGTCGTA ATCCTTATGA TGTATTTGCA GGTTTGGAAT 1260 

TGCAACAGGG TGGTTCCTAC AAGACAAAGG TTAAGTGGAA TGACATTTTA GACGAAAATG 1320 

GGAAATTGCG CCTTTCTCTT GGTTTATTTG CCCCAGATAC CATTACAAGT TTAGGAAAAA 1380 

CTGGTGAAGA TTATCATAAA AATGAAGATA TCTTCTTTAC AGGTTATCAA GGAGACCCTA 1440 

CTGGCCAAAA ACCAGGTGAC AAAGATTGGT ATGGTATTGC TAACCTAGTT GCGGACCGTA 1500 

CGCCAGCGGT AGGTAATACT TTTACTACTT CTTTTAATAC AGGTCATGGT AAAAAATGGT 1560 

TCGTAGATGG TAAGGTTTCT AAGGATTCTG AGTGGAATTA TCGTTCAGTA TCAGGTGTTC 1620 
TTCCAACATG GCGCTGGTGG CAGACTTCAA CAGGGGAAAA ACTTCGTGCA GAATATGATT 1680

TTACAGATGC CTATAATGGC GGAAATTCCC TTAAATTCTC TGGTGATGTA GCCGGTAAGA 1740

CAGATCAGGA TGTGAGACTT TATTCTACTA AGTTAGAAGT AACTGAGAAG ACCAAACTTC 1800

GTGTTGCCCA CAAGGGAGGA AAAGGTTCTA AAGTTTATAT GGCATTCTCT ACAACTCCAG 1860

ACTACAAATT CGATGATGCA GATGCATGGA AAGAGCTAAC CCTTTCTGAC AACTGGACAA 1920

ATGAAGAATT TGATCTTAGC TCACTAGCGG GTAAAACCAT CTATGCAGTC AAACTATTTT 1980

TCGAGCATGA AGGTGCTGTA AAAGATTATC AGTTTAACCT AGGACAATTA ACTATCTCGG 2040

ACAATCACCA AGAGCCACAA TCGCCGACAA GCTTTTCTGT AGTGAAACAA TCTCTTAAAA 2100

ATGCCCAAGA AGCGGAAGCA GTTGTGCAAT TTAAAGGCAA CAAGGATGCA GATTTCTATG 2160

AAGTTTATGA AAAAGATGGA GACAGCTGGA AATTACTAAC TGGCTCATCT TCTACAACTA 2220

TTTATCTACC AAAAGTTAGC CGCTCAGCAA GTGCTCAGGG TACAACTCAA GAACTGAAGG 2280

TTGTAGCAGT CGGTAAAAAT GGAGTTCGTT CAGAAGCTGC AACCACAACC TTTGATTGGG 2340

GTATGACTGT AAAAGATACC AGCCTACCAA AACCACTAGC TGAAAATATC GTTCCAGGTG 2400

CAACAGTTAT TGATAGTACT TTCCCTAAGA CTGAAGGTGG AGAAGGTATT GAAGGTATGT 2460

TGAACGGTAC CATTACTAGC TTGTCAGATA AATGGTCTTC AGCTCAGTTG AGTGGTAGTG 2520

TGGATATTCG TTTGACCAAG CCACGTACCG TTGTTAGATG GGTCATGGAT CATGCAGGAG 2580

CTGGTGGTGA GTCTGTTAAC GATGGCTTGA TGAACACTAA AGACTTTGAC CTTTATTATA 2640

AAGATGCAGA TGGTGAGTGG AAGCTAGCTA AGGAAGTCCG TGGTAACAAA GCACACGTGA 2700

CAGATATCAC TCTTGATAAA CCAATCACTG CTCAAGACTG GCGCTTGAAT GTTGTCACTT 2760

CTGACAATGG AACTCCATGG AAGGCTATTC GTATCTATAA CTGGAAAATG TATGAAAAGC 2820

TTGATACTGA GAGTGTCAAT ATTCCGATGG CCAAGGCTGC AGCCCGTTCT CTAGGCAATA 2880

ACAAGGTACA AGTTGGCTTT GCAGATGTAC CGGCTGGAGC AACTATTACC GTTTATGATA 2940

ATCCAAATTC TCAAACTCCG CTCGCAACCT TGAAGAGCGA AGTTGGAGGA GACCTAGCAA 3000

GTGCACCATT GGATTTGACA AATCAATCTG GTCTTCTTTA TTATCGTACC CAGTTGCCAG 3060

GCAAGGAAAT TAGTAATGTC CTAGCAGTTT CCGTTCCAAA AGATGACAGA AGAATCAAGT 3120

CAGTCAGCCT AGAAACAGGA CCTAAGAAAA CAAGCTACGC CGAAGGGGAG GATTTGGACC 3180

TTAGAGGTGG TGTTCTTCGA GTTCAGTATG AAGGAGGAAC TGAGGACGAA CTCATTCGCC 3240

TAACTCACGC AGGTGTATCA GTATCAGGTT TTGATAGGCA TCATAAGGGA GAACAGAATC 3300

TTACTCTCCA ATATTTGGGA CAACCGGTAA ATGCTAATTT GTCAGTGACT GTCACTGGCC 3360

AAGACGAAGC AAGTCCGAAA ACTATTTTGG GAATTGAAGT AAGTCAGGAA CCGAAAAAAG 3420

ATTACCTAGT TGGTGATAGC TTAGACTTGT CTGAAGGACG CTTTGCAGTG GCTTATAGCA 3480 

ATGACACCAT GGAAGAACAT TCCTTTACTG ATGAGGGAGT TGAAATTTCT GGTTACGATG 3540 

CTCAAAAGAC TGGTCGTCAA ACCTTGACGC TTCATTACCA AGGCCATGAA GTTAGCTTTG 3600 

ATGTTTTGGT ATCTCCAAAA GCAGCATTGA ACGATGAGTA CCTCAAACAA AAATTAGCAG 3660 

AAGTTGAAGC TGCTAAGAAC AAGGTGGTCT ATAACTTTGC TTCATCAGAA GTAAAAGAAG 3720 

CCTTCTTGAA AGCAATTGAA GCGGCCGAAC AAGTGTTGAA AGACCATGAA ACTAGCACCC 3780 

AAGATCAAGT CAATGACCGA CTTAATAAAT TGACAGAAGC TCATAAAGCT CTGAATGGTC 3840 

AAGAGAAATT TACGGAAGAA AAGACAGAGC TTGATCGCTT AACAGGTGAG GTTCAAGAAC 3900 

TCTTGGCTGC CAAACCAAAC CATCCTTCAG GTTCTGCCCT AGCTCCGCTT CTTGAGAAAA 3960 

ACAAGGCCTT GGTTGAAAAA GTAGATTTGA GTCCAGAAGA GCTTACAACA GCGAAACAGA 4020 

GTCTAAAAGA TCTGGTTGCT TTATTGAAAG AAGACAAGCC AGCAGTCTTT TCTGATAGTA 4080 

AAACAGGTGT TGAAGTACAC TTCTCAAATA AAGAGAAGAC TGTCATCAAG GGTTTGAAAG 4140 

TAGAGCGTGT TCAAGCAAGT GCTGAAGAGA AGAAATACTT TGCTGGAGAA GATGCTCATG 4200 

TCTTTGAAAT AGAAGGTTTG GATGAAAAAG GTCAAGATGT TGATCTCTCT TATGCTTCTA 4260 

TTGTGAAAAT CCCAATTGAA AAAGATAAGA AAGTTAAGAA AGTATTTTTC TTACCTGAAG 4320 

GCAAAGAGGC AGTAGAATTG GCTTTTGAAC AAACGGATAG TCATGTTATC TTTACAGCAC 4380 

CTCACTTTAC TCATTATGCC TTTGTTTATG AATCTGCTGA AAAACCACAA CCTGCTAAAC 4440 

CAGCACCACA AAACACAGTC CTTCCAAAAC CTACTTATCA ACCGACTTCT GATCAACAAA 4500 

AGGCTCCTAA ATTGGAAGTT CAAGAGGAAA AGGTTGCCTT TCATCGTCAA GAGCATGAAA 4560 

ATACTGAGAT GCTAGTTGGG GAACAACGAG TCATCATACA GGGACGAGAT GGACTGTTAA 4620 

GACATGTCTT TGAAGTTGAT GAAAACGGTC AGCGTCGTCT TCGTTCAACA GAAGTCATCC 4680 

AAGAAGCGAT TCCAGAAATT GTTGAAATTG GAACAAAAGT AAAAACAGTA CCAGCAGTAG 4740 

TAGCTACACA GGAAAAACCA GCTCAAAATA CAGCAGTTAA ATCAGAAGAA GCAAGCAAAC 4800 

AATTGCCAAA TACAGGAACA GCTGATGCTA ATGAAGCCCT AATAGCAGGC TTAGCCAGCC 4860 

TTGGTCTTGC TAGTTTAGCC TTGACCTTGA GACGGAAAAG AGAAGATAAA GATTAAATAT 4920 

CGAAAAATCT TGTGAAATCT TTCCG 4945 
(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25002 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 



GACAACTCAA GTAGCTTTTT CTTATTTTGA AAAAGGAGAT CAGAGTTTAA CTATGTCAGA 60

AAAATCACAA TGGGGGTCGA AACTTGGTTT TATTCTAGCA TCTGCTGGCT GGCCATCGGG 120

CTTGGTTCCG TTTGGAAGTT TCCCTACATG ACTGCTGCTA ATGGCGGTGG AGGCTTTTTA 180

CTAATCTTTC TCATTTCCAC TATTTTAATC GGTTTCCCTC TCCTGCTGGC TGAGTTTGCC 240

CTTGGCCGTA GTGCTGGCGT TTCCGCTATC AAAACCTTTG GAAAACTGGG CAAGAATAAC 300

AAGTACAACT TTATCGGTTG GATTGGCGCC TTTGCCCTCT TTATCCTCTT ATCTTTTTAC 360

AGTGTTATCG GAGGATGGAT TCTAGTCTAT CTAGGTATTG AGTTTGGGAA ATTGTTCCAA 420

CTTGGTGGAA CGGGTGATTA TGCTCAGTTA TTTACTTCAA TCATTTCAAA TCCAGCCATT 480

GCCCTAGGAG CTCAAGCGGC CTTTATCCTA TTGAATATCT TCATTGTATC ACGTGGGGTT 540

CAAAAAGGGA TTGAAAGAGC TTCGAAAGTC ATGATGCCCC TGCTCTTTAT CGTCTTTGTT 600

TTTATCATCG GTCGCTCTCT CAGTTTGCCA AATGCCATGG AAGGGGTTCT TTACTTCCTC 660

AAACCAGACT TTTCAAAACT GACTAGCACT GGTCTCCTCT ATGCTCTGGG ACAATCTTTC 720

TTTGCCCTCT CACTAGGGGT TACAGTCATG TTGACCTATG CTTCTTACTT AGACAAGAAA 780

ACCAATCTAG TCCAGTCAGG AATCTCCATC GTAGCCATGA ATATCTCGAT ATCCATCATG 840

GCAGGTCTAG CCATTTTCCA AGCTCGATCC CCCTTCAATA TCCAGTCTGA AGGGGGACCC 900

AGCCTGCTCT TTATCGTCTT GCCTCAACTC TTTGACAAGA TGCCTTTTGG AACCATTTTC 960

TACGTCCTCT TCCTCTTGCT CTTCCTTTTT GCGACAGTCA CTTTTTCTGT CGTGATGCTG 1020

GAAATCAATG TAGACAATAT CACCAACCAG GATAACAGCA AACGTGCCAA ATGGAGTGTT 1080

ATTTTAGGAA TTTTGACCTT TGTCTTTGGC ATTCCTTCAG CCCTATCTTA CGGTGTCATG 1140

GCGGATGTTC ACATTTTTGG TAAGACCTTC TTTGACGCTA TGGACTTCTT GGTTTCCAAT 1200

CTCCTCATGC CATTTGGAGC TCTCTACCTT TCACTTTTTA CAGGCTATAT CTTTAAAAAG 1260

GCTCTTGCAA TGGAGGAACT CCATCTCGAT GAAAGAGCAT GGAAACAAGG ACTGTTCCAA 1320

GTCTGGCTCT TCCTTCTTCG TTTCTTCGTT TCGTCATTCC AATCATCATC ATTGTGGTCT 1380

TCATTGCCCA ATTTATGTAA TCAAAAAGGA CTTGAGTAGT GAACTCAGGC CCTTTCTTTT 1440

TATGGATGGC TAACAATCAA TTCCAAACCT TGCCCTTCCA GAGTCCAAGC TTCAACATCA 1500

CTTGGTAGGA TAAAGTGGCT GCCTTTTTGA ATTGGATAAT TTTTCCCGTC AACAGTTAGC 1560

TGACCTTGAC CAGCCAAGAC ACTCAATAAG CTGTAGTCAG CTGTCTTTTC AAAGTCAACT 1620

TTTCCAGTAA TTTCCCACTT GTAAACTGCG AAGAAATCAT TAGATACAAG GAGAGTGGAA 1680 

CGCAAATCAT CTGCTTTAAC AGTTACAGGA CGGCTATTTG CTGGCTCACC AATGTTCAAG 1740 

ACATCGATGG ATTTTTCAAG ATGAAGTTCA CGCAAGTTGC CTTTGTCATC CTTGCGGTCA 1800 

AAGTCATAGA CGCGATAGGT GGTATCGCTA GACTGCTGGG TTTCAAGGAT TAAGATACCC 1860

GCCCCGATAG CGTGCATAGT CCCGCTTGGT ACATAGAAGA AATCTCCAGC CTTAACAGGG 1920 

ACTTTGGTCA ACAAGTCATC CCAGTTCTTG TCGTCGATTT GCTGGCGGAG TTCTTCTTTT 1980 

GACTTGGCAT TGTGACCGTA GATAATCTCT GAACCTTCAT CCGCTGCGAT AATGTACCAG 2040 

CATTCTGTTT TTCCGAGTTC GCCTTCATGC TCGAGTCCAT AAGCATCGTC TGGGTGAACT 2100 

TGGACACTGA GCCAGTCGTT GGCATCGAGG ATCTTGGTCA AAAGTGGAAA TACAGGTTCT 2160 

GGACGATTGC CAAATAATTC ACGGTGTTCC GCATACAAAG TAGCAAGATC TGTTCCCTCG 2220 

TAACGACCAT TGGCAACTTT AGAGACTCCA TTTGGATGGG CTGAGATGGC CCAATATTCT 2280 

CCGATTTTTT CACTTGGGAT GTCGTAGCCA AACTCATCAC GTAGCTTGGC TCCACCCCAG 2340 

ATTTTTTCTT GCATAACTGA TTGTAAAAAT AATGGTTCTG ACATGTCGAT CTCCTGTCTG 2400 

ATTTTTCTCC CCTCATTATA GCAAAAAAAG AGTTCGAATT GAACTCTTTT TTACATCTTA 2460

TAAAGCAGGG AGAAGATTTT ATAAAAATAG TAAACAAATG TGCTCTACCC GATGCTTGCA 2520

CCATTGCTAT AAATGACATC CTTGTACCAA TAGAAGGACT TCTTCTTGCT ACGTTTGAGA 2580 

GCTCCGTTTC CTACATTATC TCGATCTACA TAGATAAAGC CATAGCGCTT ATTCATTTCC 2640 

CCTGTGCCAG CTGAAACCGG ATCGATACAG CCCCAAGTCG TATAACCAAG CAAGTCAACC 2700 

CCGTCTTGGT AAATGGCATC TCGCATGGCC TTGATGTGGG CCTCTAAGTA AGTAATCCGA 2760 

TAGTCATCTG CTACATAACC ATTCTCATCC GGTGTATCCA TAGCACCGAG TCCATTTTCT 2820 

ACGATAATAC TAAACTAAAA TCAAAAAGCA TTATATAATA GTGATATGAA ATCAACTAAA 2880 

GAAGAAATCC AAACCATCAA AACACTTTTA AAAGACTCTC GTACAGCTAA ATATCATAAA 2940 
CGGCTTCAAA TCGTTCTATA GTAAAATGAA ATAAGAACAG TACAAATCGA TCAGGACAGT 3000

CAAATCGATT TCTAACAATG TTTTAGAAGT AGGGGTGTAC TATTCTAGTT TCAATCTACT 3060 

ATATTTCGTC TGATGGGCAA ATCTTATAAA GAGATTATAG AACTTTTATA GTAGTTTGAA 3120 

ATAAGATGTG AACAACTCTA TCAGGAAAGT CAAATTAATT TATAGAAATA TTTTAGCAGC 3180 

CAAGGTGTAC TGTTATAGAT TCAATACACT ATAGACTGTA ATCAAACAAC GATTTGGCGA 3240 

AATGTAAAAA AATATGAGGA GTTCGGACTC GACTCTCTCC TTCAAGAAAC ACGTGGTGGT 3300 

CGTAACCATG CATATATGAC AGTTGAGGAA GAGAAAGCCT TTCTTGCCCG CCATTTGAAG 3360 
GCTACAGAGG CAGGAGAATT TGTTACAATT GATGCCTTAT TTCAGGCTTA TAAAAAGGAG 3420

TTAGGTCGTT CCTACACACG TGATGCCTTC TATCAACTGT TGAAGCGCCA TGGTTGGCGA 3480

AATATTACGC CACGTCCAGA ACATCCTAAG AAAGCAGACG CTCAAACCAT TGTTGCGTCT 3540

AAAAATAAAA TCTCAATCGA AGAAGGCAAG AAAGCGTTTT AAATATAGTA GACGTTTTCG 3600

TAAGGTTTGC TTGATGTACC AAGCTGAAGC TGGTTTCGGT AGAATCAGTA AACTGGGATC 3660

TTGTTGGGCT CCAATAGGAG TAGGTCCACA TATCCATAGT CACTATATAC GAGAATTTCG 3720

CTATTGTTAT GGAGCTGTTG ATGCCTATAC AGGCGAATCA TTTTTCTTAA TAGCTGGTAG 3780

ATGTAATACT GAGTGGATGA ACGCCTTTTT AGAAGAGCTT TCACAAGCTT ATCCTTTTAC 3840

TCGTTATGGA CAATGCTATA TGGCATAAAT CAAGTACCTT AAAGATTCCG ACTAATATTG 3900

GTTTTGCATT TATTCCTCCA TACACACCAG AGATGAACCC CATTGAACAA GTGTGGAAAG 3960

AGATTCGTAA ACGTGGATTT AAGAATAAAG CCTTTCGAAT TTTGGAAGAT GTCATGAATC 4020

AACTCCAAGA TGTCATACAA GGATTGGAGA AGGAGGTGAT AAAGTCCATC GTTAATCGGA 4080

GATGGACTAG AATGCTTTTT GAAAGCAGAT GAGTATTATA TGCAATTTCT TTATATAAAA 4140

AGACCGGATT GCTCCGATCT TTCAATAGTT CATATTCTCA ATTTCTATTT TAAAAATAGC 4200

TAAGGTTAAC GTCAAATGAC TACGCGACCT ATTTCATACG ATAAAAATGA AGCACTAGAC 4260

CAGCAGGTCC TTGAACTAAT AAGGACTCTG TTCCCCAATC GGTTACAGTT GGTCCGTGTA 4320

AAACCTTTAT ACCAAGCTCG TTCAACCGTT TGTAGTTCTG GTCTACATCC TCAACCTCGA 4380

TATGAATAAT GATTCCTGAC TGAAAGTTTT CCAAAGGAAC CAAATGATTT TGTGACAACA 4440

TAAGGCAGTG ACTACCAATC GTAAACTGAG CAAAACCATC ATTAGCATAA TCTGCCTTTT 4500

TATCCAAGAT ATGCTCCAAG TCAGCACAGA CTTGGGGAAC ATTTGAAACG ATAATATGTA 4560

ATTGATTTAA ATTCATTTAC TCTCCTCCAT AAAAAGACCG GATTGCTCCG ATCTTTTAAA 4620

GTTCTGCTCT ATGAAAATCA AAGAATAAAG TCTACAAGTT TCATATTTGA TTTTCGGGGA 4680

GAGGAATTAT TTAATTGCGC GTGATTGCAA TCCTTCTTCT TCCAAGAAGA GACGGAATGG 4740

TACGAGTTCT TCTGCTTCGT ATTTTTCCTT GAAGGCTTTG ATAGCTTCTT CTGAGTGAAG 4800

TTTTGGATCC AATTCAAGTA CTTCTACTGG AAGTGGACGG TGTTGAGTGA TGCGAGCATC 4860

GATGACAACA GTTTTACCTT CTTTGTTCAA TTTAACAGCT TCTGCAACAA CTGCATCGAT 4920

GTCTTCGATA CGGTCAACTG TGAATCCAAC AGCTCCTTGA GCTTCCGCAA TTTTAGCGTA 4980

GTCAGCGTTT GTGAAGTCTA CACCAAACAA GTGTTTGTTT GTATCTTGGT ATTTGTTCTT 5040

GATGAAGCCG TACTCAGCAT TTGAGAAGAC AAGGTTGATA ACTGGAAGGT CGTATTGAAC 5100




GTTTGTGATA ACGTCTGGGT AGCACATGTT GAATGCTCCG TCACCCATGA TGTTCCATAC 5160 

TTGGCGATCT GGATTGTCTT TCTTAGCAGC GATACCACCA GGAAGGGCAA TACCCATTGT 5220 

CGCAAAGAGT GGAGATGTAC GCCACATGTT CTTAGGTGTC ATGTGAAGGT GACGAGTAGA 5280 

TGTTTGAGTA GTGTTACCTA CGTCGATTGA GTAGATAGCG TCTTGATCAG CATGTTTGTT 5340 

GATTGCATTG TAAACTTGAT ACAATTGCAA TTCACCCTCA GTTTTACCTT CGAGTTTGTT 5400 

CATGTAATCA CGCCAGTTTT GGTTGTTCTT AACGTTTGCA CGCCACCATG GAGTTGATTC 5460 

AACTGGGTTT ACTTTGTCAA GGATAGCTTT AGCTGCTTGA CCAGCATCAC CAAGGATTGA 5520 

AGCGTCAAGG GCATGACGTT TACCAAGTTT GTAAGGGTCG ATATCGACTT GGATGAATTT 5580 

TTCAGTGTTC TTGAATGCTT CGTAAACTTC AGCAAATGGG AAGTTTGAAC CAAGGAAAAG 5640 

AACTGTGTCT GCTTCAAAGA CCACTTCGTT GGCTGGTTTC CAACCAACAC GGTAAGCAGA 5700 

ACCTGTCAAA CCTTCATAGT TCCATTCGAA AGCTTCAAAG TTTTTACCAG TTGTGATGAT 5760 

TGGTGCTTTG ATTTTACGTG ACAATTCAGT AATCACTTCA CCAGCTTTAA CACCACCAAA 5820 

TCCAGCATAG ATAACTGGGC GTTCAGCATT GTTCAAGATT TCAACAGCTT TGTCGATTTC 5880 

AACTTCGTTC AAAGCAGGAG CGATGAATGA GCGTTCGTAT GAACCTGAAC CGTAGTATGA 5940 

GTTTTCATCG ATTTCTTGGA AACCGAAGTT TACTGGAATT TCAACAACAG CTGGACCTTT 6000 

TTTAGAAACT GCAGCACGGC AGGCTTCGTC AATTACTTTT GGCAATTGCT CAGCGTAAGC 6060 

TACACGTTTG TTGTAAACAG CGATACCGTT GTACATTGGG TTTTGGTTAA GCTCTTGGAA 6120 

AGCATCCATG TTCAATTCGT TAACTGGACG TGATCCAAGG ATCGCTAGGA ATGGAGTGTT 6180 

ATCCATAGCT GCATCGTAAA CACCGTTAAT CAAGTGAGTC GCACCTGGAC CACCTGAACC 6240 

AACTGCAACC CCGATTGAGC CGCCGAATTT AGCTTGCATA ACCGCTGCAA GAGCACCTGT 6300 

CTCTTCGTGG CGAACTTGTA AGAAACGGAT ATCTTTGTCT TCAGCCAAAG CGTCCATCAA 6360 

TGAGCTGAGT GTTCCTGATG GGATACCGTA GATTGTATCT ACGCCCCATG TTTTCAATAC 6420 

GTTAAGCATT GCTGCAGATG CAGTAATTTT CCCTTGAGTC ATAATGATAA CTCTCCTTCA 6480 

ATTTTTTTAA ACTTGGAGAA TACGATTACA TAGAATTGGA AACGTTCTCC AAATTTTTAC 6540 

TATTCCACTG TATCATATTT ATGCTGACTT TTCTAAAAAT CTGCTCAAAA CTCTCTATTC 6600 

TCTATTCTAA TACAGTTTTG AAAGTTCTGT CATTTCTGTT TTATAACAAA GAAATCTAGT 6660 

CATTACTTTT AGTCTATTTT ACTAAAATTT AACAGAAGGG AACTGGTCAG AACAGATACA 6720 

GAACTAAAGG CCATGGCTAG ACCTGCCAAT TCTGGGTTGA GAGCCAGTCC AACACCTGAA 6780 

AAGACTCCTG CTGCAATCGG AATTCCGACA ACATTGTAGA TAAAAGCCCA GAAAAGATTG 6840 

AGTAGAATTC GATGAAAGGT TTTCTTACTC ATATCAAAGG CACGAACCAC TCCTAAAAGA 6900

TTATTGGTTG TCAACACCAA ATCTGCTGAC TCGATGGCGA TATCTGTTCC AGCTCCCATA 6960

GCAATCCCCA CATCTGCTAC ACTAAGGGCA GGAGCGTCAT TGATACCGTC CCCAACAAAG 7020

GCTACTTTCC CTGACTGTTG CAGTTTATGG ATTTCATGGG CTTTTTCTTC TGGCAAGACG 7080

CCTGCAATGA CCTCTTCAAT TCCGATTTGA TCTGCAATAG CACGCGCCAC ACCAGCATTG 7140

TCTCCTGTCA GCATGACTGT TCGGAGACCA CGTTTTTTTA GCTGACTGAT GGCTAGCTTA 7200

GCATTTTCCT TAGGAATATC TTGCAAAGCA AGCAAGCCTT TGATTTCATT GTCAACAGCT 7260

AAGAACACAA CTGTCTTAGC TTCTTTTTCT AGTTCTTCTA GTTTATCTTG ATAAGTATTA 7320

GAAATATCCA TGCCATCCAG CATTTTAGCA TTTCCAAGTA AAACTTGTTT TCCATTGATT 7380

CGCCCTGAAA CACCTTTCCC GTGCAAGGAC TGAAAATTTT CAACAGTTTG AAACTCAAGT 7440

CCAGCTTCAC TCGCTCGCTT AACGATAGCC TCAGCCAGTG GGTGTTGAGA AGCATCTTCC 7500

AAGGAGGCTG CCAACCCAAA CACTTCTACT TCGTCGCCGA TGACATCTGT TACCACAGGT 7560

TTCCCTTCCG TCAAAGTCCC GGTCTTATCA AAGACAAGGG TTTGAACTTT CTGGATTTGC 7620

TGTAAGACAG TTCCATTTTT GAGGAGAACC CCCATCTTGG CACTACGTCC TGTCCCCACC 7680

ATAAGGGCTG TCGGTGTTGC AAGTCCCAAG GCACAAGGAC AGGCGATAAT CAAAACCGCC 7740

ACTCCGTAGA GAAGAGAGGA CACAAAGCTA GCTCCAAGCA CAACCACACT ATCCCTGAGC 7800

AAGACGAACC AAACCCAAAA GGTCATGATT CCTAAAATGA CAACTACTGG GACAAAAATC 7860

CCTGAAATCT TATCCGTCAA GTCCTGAATC GGCGCACGAC TTGTCTGAGC TTTCTTCACA 7920

AAATCCACAA TCTGAGCCAA AACAGTCTCT GAGCCAACTT TTTCTGCTCT AAAGACAAGG 7980

GTTCCACTAT GATTGATGGT TGAGCCAATG ACAGTATCTC CAACTGTCTT GTCCACAGGC 8040

AGACTCTCAC CTGTCACCAT GGATTCGTCA ATACTAGAGA CACCTTCTAC TACGACACCA 8100

TCAACAGCAA TCTTTTCACC GGGACGCACT CGAATCAGGT CGCCTACCTT GACTTGTTCC 8160

AAAGGAACTT GGACATAACT ATCATCACTC AAGACTTCTG CGGTTTTAGC TTGCAAGTCC 8220

AGTAATTTCT CCACAGCTTG GGACGTATTT TTTCTCATTT TTTCCTCAAA AACTGCTGGC 8280

AAAAGAACGA AAAAGAGGAT AAATCCAGCA CTTTCGAAGT AAACAGGGAG ACCAGCAAAG 8340

AGAGCAACTA GGCTATAGAA ATAAGCCACT AGAGTTCCCA GCGCAACCAA GGTATCCATG 8400

TTGGCATTGT GCTTTTTAAA ACTGGCCCAA GCACTCTGGA TATATGGCTT ACCTGCAACT 8460

AACATAATAG GCGTTGTTGC TAGAAAGGTT CCCCAATGCA TGACTTGATG ACTAATGCTA 8520

CCTGTCAACA TCCCAATCAT GAGAATCACA AGAGGCACAG TAAAGATACT AGTAATCCAA 8580

AAACGTTGCA GGAGAGATAG AGATTTTCGA GTCTTCTCAA CGACTGTATA GCTTCCCTTT 8640

TGCATCTTCA TGCCACAAGA AAATTCATGT CGCCCTAATT CTTGAGGCGT AAAACGAATG 8700

ACTTTCTCCT CATCTACGCC GATTGGTTCC AAGATACCTT CTTCTTCAAA CAGAATTTCC 8760 

TTATAACAGT TTGAAGGAGT AGCACGATGA AAGGTAATCT CAGCTGGAAT TCCCTTTTGA 8820 

AGCTGGATAT GGGCTGGATG ATAGCCTTTT TCAGCTCGGA TACGGATTTT TTGAATGCCA 8880 

TTTTCTAAGC TTGCTTTCAC AATTTCTGTC ATAGTCTCCA CCTACTCTAC AATCATCTTG 8940 

CCGTGCATCA TGTTCATACC ACAAGCAAAG CCAAACTCTC CAGCCTGTTC AGGCGTGATT 9000 

TCCACTACAT ACTCTTCCCC CATTGGCAGG TTCGCATGTA CACCAAAATC TGGAAAAACA 9060 

ATTTGATCCA GACATGGTGA AGGATCCTTG CGGTCAAAGA CAATGCGTGC TGGCACTGAT 9120 

TTCTTGAGGA CAATCAACTC AGGAGTATAG CCTCCCATGA CTTCCACTCG AATCTCTTGG 9180

TATCCGTTTT TTTGCTGGGC TTTTTGTCCA GATTTTTCAG GCTTTTTGAA AAACCAAAAC 9240 

AAGATAAACG CGATAAGGGC AATACAAATA ATGGTTACAA TACTATTTAA CATGACGTCT 9300 

CCTTTACATA CAATTACATC TTACTTCTGT TACAGCACTT GATTTCTTCT CTGAAATCAC 9360 

AGCTTCCAAG TCTTCCAAGT CAGTCTGAGT AAATTCACAT TCTACAATCA AGTCAGCCAA 9420 

CAAATTCCTA ATCCTACGGG AACAAACCTT GTCTTTGATA TCTTGGACAA GTAAATCCCG 9480 

ACTTTGGTCT AGAGTTAAAA GGGCTGAATA AACAAAGGAC TTGCCTTCTT TTTTCCGAGT 9540 

CAAACACTCT TTATCAACCA GACGAGCCAA AAGTGTCTGA ACCGTGGACT TGGACCAGTC 9600 

AAACCGCTCT GCCAAAACCC TAATCAAATC TGTACTGGTC TGCTCCCCCT GCATCCAAAT 9660 

AATCTTCATG ACCTGCCATT CTGCATCTGA AATCTGCATT ACCATACCTC CAAAATCTAC 9720 

ATTTGTCAAT TACACTCATC AGTATACTCT TAAAATCTAC ATTTGTCAAT TATAGAAATA 9780 

ATATTTTCTT CGAAAAATAG AATTTTAATC ATTTGAAAAA CGATTTGCAG TCAAATATTA 9840 

CTATATAAAC AATAAAAATA TGCTATACTA AAGAAAAAAG AAAACAACCA CTAGGGGTGC 9900 

GTAAAGCTGA GATTAACGAC TGTTAGATCC CTCTGACTCA ATCTAGGTAA TGCTAGCTGA 9960 

TGGAAGTGGA AATGATAATG GGGACTAGCA GTCTTCTATT GCCTTTCTAA AACAGACTAG 10020 

CTTGTTCTTA AGAATACAAA CTTCAGTTGG TTGGGAGGTT TTAGATGACT TATTTACCCG 10080 

TTGCTTTGAC CATTGCAGGG ACTGACCCTA GTGGTGGTGC TGGCATTATG GCAGATTTAA 10140 

AGTCATTCCA AGCGAGAGAT GTCTATGGAA TGGCTGTTGT AACCAGTCTT GTCGCTCAAA 10200 

ATACCAGAGG TGTTCAGCTA ATCGAGCACG TTTCTCCTCA AATGTTGAAA GCCCAATTGG 10260 

AGAGTGTCTT TTCTGATATT CCACCTCAGG CTGTAAAAAC TGGAATGTTG GCTACTACTG 10320 

AAATCATGGA AATCATCCAA CCCTATCTTA AAAAACTGGA TTGTCCCTAT GTCCTTGATC 10380 

CTGTTATGGT TGCTACAAGT GGAGATGCCT TGATTGACTC AAATGCTAGA GACTATCTCA 10440

AAACAAACTT ACTACCTCTA GCAACTATTA TTACGCCAAA TCTTCCTGAA GCAGAAGAGA 10500

TTGTTGGTTT TTCAATCCAT GACCCCGAAG ACATGCAGCG TGCTGGTCGC CTGATTTTAA 10560

AAGAATTTGG TCCTCAGTCT GTGGTTATCA AAGGCGGACA TCTCAAAGGT GGTGCTAAAG 10620

ATTTCCTCTT TACCAAGAAT GAACAATTTG TCTGGGAAAG CCCACGAATT CAAACCTGTC 10680

ACACCCATGG TACTGGATGT ACCTTTGCTG CAGTGATTAC TGCTGAACTA GCCAAGGGCA 10740

AGAGTGTTTA CCAGGCAGTT GATAAGGCCA AGGCCTTTAT CACAAAAGCT ATTCAAGATG 10800

CCCCTCAACT CGGTCATGGT TCTGGTCCAG TCAACCATAC AACTTTTAAA GATTAAGAAA 10860

AAAAACTCTC TAGTTCCCAC TTTAAGGGAA TTAGAGAGTT TTTATACTCf TCGAAAATCT 10920

CTTCAAACTA CGTCAGCTTC CATCTGCAGC CTCAAAACAC TGTTTTGAGC TGACTTCGTG 10980

AGTCTTATCT AAAACCTCAA GGCAGTACTT TGAGCAACCT GCGACTAGCT TTCTAGTTTA 11040

CTCTTTGATT TTCATTGAGT ATTAATTAGG AAAGAATGTT ATGCAACTTT TTTAAAAAGG 11100

CTTGCGTTTT TGCCTCAATA TCTTCTGCTT GCATCAAATC ACGTACAACA GCTACACCAG 11160

CTATGCCAGT GCCCATAAGC TGATCAATAT TCTCCGAAGT CAAGCCTCCA ATAGCAACTA 11220

CTGGAATGGC AACCGTTTGG CAAATTGTTT TCAAGGTCGA TATCAGAGTA ATGGGCGCAT 11280

TTTCCTTGGT GGTGGTTGGG AAAATGGCTC CTGTACCCAA GTAATCTGCA CCTGATTTGT 11340

CCGCTTCCAG AGCTCTTTTA ACCGTTTTAG CGGTGACACC GAGGATTTTT TCAGGACCCA 11400

AGACTTTGCG AGCTACCGAA ACTGGTAATT CATCATCTCC GATATGCAGA CCTGCTGCAT 11460

CAACCGCAAG ACAAACATCC AACCGATCAT CGATTATCAA GGGTACCTGA TAAGCATCTG 11520

TTATTTCCTT GACTTGTTTT GCCAGTTGAT AATATTGATT GGTTGTGAGA TTTTTTTCTC 11580

GCAATTGGAC TATGGTAACC CCTGAACGGC AGGCCGTCTC AACTTTTGCA AGAAAGCTTT 11640

CCACGGAATC TTGATAGCGA TTGGTTACCA GATATAGTCT AAGTGCTTCT CTATTCATAA 11700

ACCTCTCCTT TGATGGTATC TAGCCAATTT TCATCTCTTC TTAGGAGCGA AAGCTGATTG 11760

AGTACTTGGT AACGAAATTC TTCCAATCCC ATTCCTTGAA CAACTATTTT CTCAGCAGCG 11820

ATATTGAGAT AAGAGACTGC TAAGCAAGAA GCTTCAAAAC CAGTCTTTCC TTGGCTGAGA 11880

AAAACAGCTG TTAAGGCTCC AACCAAGTCT CCTGTCCCTG TTATCCAGTC TAATTCAGTA 11940

CAGCCATTTC CCAGTACAGC GACCTGATTT TTCGAAACGA CGAGGTCCTT GGGACCTGTG 12000

ACTAAGAAAG ACATACCAGG ATAGGTCTGA CACCAGTCTT TCAAGACTTG AAGCAAATCC 12060

TCCGTTTCTT GATCTTTAGC ACTCGCATCG ACCCCAACGG CGTGGTGCTT TAATGCAACA 12120

AGACTTCGAA TTTCTGACAT GTTTCCTTTA AGGACCGTAG GTCTATAGTC TAAAAGGTGT 12180

TTAACTAAGC TCTTACGAAT GGATGAAGTC GTTACGCCAA CCGCATCTAC TACCATCGGG 12240

AGAGAAGATT GGTTTGCATA CGAAGCTGCC ATGCGGATTG CTTTTTCCTT CTCAGCTGAC 12300 

AAATGCCCCA AATTGATGAA GAGAGCCTGA CTTTGCTTAG TAAAATCAAG AACTTCACGG 12360 

GAATCATCTG CCATGACAGG TTTGCATCCC AGAGCCAAAA TCCCATTTGC CAGCATCTCA 12420 

CAAGAAATCT CATTGGTAAT GCAGTGAATG AGGGAACTAG AGCCTATAGG AAAGGGATTT 12480 

GTAAATTCCT GCATCAGTCT ATCCTTTCAC TAAAGAAATA TCCCTGCACT TTTTTAAAGA 12540 

ATTCCTGCTT GATTAAAAAT CGAAAGGCAA TAAAGGAAAT CGCTGTACCA ATCAAGGTTG 12600 

CTCCGAAAAA TCGAGGCGTG TAGATAAACC AGCTAAGCTT AGCAGCTGAT CCTGTAAAGA 12660 

GTACCATAAC AGGATAGGAA ACAATGGAAC CAATAATACC TGTTCCCAAA ATCTCTCCTA 12720 

GAGCAGAATA GTGAAATTTT CGACCGTACT TATAAAAGAG ACCTGCTAGA AGGGCTCCAA 12780 

AAGTCGCTCC TGTGAGAGCT AAAGGCGGAA TCCCTTGAGT CGTCATACGG ATAAAGGCTG 12840 

TGACTGTAGC CATAGCCAAG GCATAAACAG GTCCCATCAT GATTCCTGCT AGAATATTGA 12900 

CTACACTGGA CATCGGTGCC ATTCCCTCAA TTCGAAAGAT AGGTGTAAGG ACTACATCAA 12960 

GGGCAATCAT CATAGATAAA ATGGTTAATT TGTGAACTTG TAATTGGTGC TTTCTCATGC 13020 

TTCTATTCTT CTCCTTTTTC TAAAGACTGT AAATCGCTCT TCCATGTCTG GTGTTGGTAG 13080 

GCCATTTCCC AAAACTTGGC TTCCATATGA ACACTGATGT GGAAGGCATC TAGCATTTTT 13140 

TGCTTGTCTG TCTCGTCACT TTCTCGATAG AGCTGATTGA CCAGTGCTCC CTCCTCTCTG 13200 

ATCTGTTGCT CTAACTCATC CGTAATATAA GTTTCAATCC ATTGTTGATA GAGAGGATTT 13260 

GGTGATGGTT TAAGATTAAG TGATTTGCCT ATATCATGGT ATAACCAAGG ACAAGGAAGC 13320 

AAGCTTGCAA AAGCGATGGC TAAGTTCGGT TCTGCAAATT GCCTATAAAT ATGAGAAATG 13380 

TAATGATAAC AGGTTGGAGC GATTGGATGT TGCTCCATTT CCTGGTCGCT GATTTCCAAT 13440 

TCCTTGAAAA ATTGTTGGCG AATAAATAAC TCACCCTCCA CTAAACCCTG AGCATTTTGT 13500 

TTCAAGAGTC TTTTCATCTC TTGGTTTGAA GTCTTATCAG CCAAAAGATG ATAGATTTCT 13560 

GAGAAAGCCT TCAGATAGTA GGCATCCTGA ATCAGGTAAT AGCGGAAAAT GGCAGGTTCT 13620 

AAATTCCCCT CTTGTAATTG TAAAATAAAG GGATGATGAA AGGAAGCCTG CCAAGCTTTC 13680 

TTGGATAATT CCATCGCAAT ATCTGTAAAT TCCATAATAA CTCCTTTATA AAAATAGACT 13740 

GGTTTGAAGC AATAAAAAGA AAAGCAGGTA GATTAATTTT GTTTTTTTAG GAATATAAAA 13800 

AGTCCGATAG CTATTCTTCA ACTGTGCATG TTCGTCATAT CCGTGAGCAG ATAGAGCTCT 13860 

CAGGTAAAGA TGGCGCCACC TAAAGACTGT CATCAGAACC TTACTGTAAA TCAAGGGCGA 13920 

CCAAAAATGT AGTTCTTGAC CACGTAATAG GCAAGCTTCT TTGAGGGACT TGATTTCTTG 13980

CTGAATGAGA GGAAAAGAAT TGAATACCAC AATCAAGGCA TAGGACCAAG AGCGTGATAG 14040

CCCCTTTTGA GCCAAGTACA AGAGAAGCTC TTTTAGTGAA ACAGAGGAAA CAAAGACAAG 14100

GCCGATACAA ACTGTCACAA AGGCCCTCGT TCCAAGCATG ACTGCCTGTG AAGCATCTCC 14160

GTGTAACTGA ACTGCCCAGT AGTTGGCAAA AGATGGTAAA ATGGCAAGTA TGATCATCGA 14220

AGCTAACATT TTAAATCGAC GGTAATAGAG CATAAAGAGA ATACAAAATG CGACTACCGA 14280

AAGAGTCAGA GCAATCGAAG GAATGAAAGA TGTTTCCAAG GATAAAATCA GCAAGAAGAG 14340

ACTGATAATC GGTGTCTGGG TTGCTACTTT GACCATACTA TCTCACCTCC CCTTGGGTAT 14400

TGCTACTCTG AGATGTAAGT GGTTTGGTAA TGGTCACTTC TTTCACATGC CGAAGACCCT 14460

GACTAGTCAT CTCAATCCAA TAATCAACCA CAGAAATCAA AGGGTCTAAA CGATGACTAA 14520

TGAGCAGAAA ACTTCTTCCT TGATTCCTCT CCTCCACAAT CCACTTGCAA AAATAATGGC 14580

AGGCTCTATC ATCCAAACCT GCAAAAGGTT CATCTAGCAA GATCACGGAA GCCTTACTGG 14640

TCAAGATGGT CAGGAGCTGA AGAATTTTTT GCTGACCACC ACTTAATTGA TAGGGACTCT 14700

TATCGACTGC CTGCTCCAAA TCAAAATATC GTAAAGCTTG AAAAATCCGC TGATTTCTTT 14760

CAGAATCAGG TCCATCTAAT TGAAGCTCCT CTCGCAGACT GACTCGGATA AACTGCTTCT 14820

CAGCTTCCTG AACAACACCA GTCAGATCAC GATACAAACT CTTTTTCTTT TTCAGGACCG 14880

AACCCTTCCA AGTAATGCTC CCCTTATACT TTTGAAATTG AAGAATAGAC CGAAAGAGGG 14940

TTGATTTCCC GACACCATTG TCACCCAGGA TACAGGAAAT CCCTTGATAG AATGTGAAAT 15000

CAGCAATTGA AAAGAGGGGG CGATTACCAA GCTCACCAGT CACACGGTTC ATATGGAATA 15060

GTTCCGGGCT AGAAGCAACT TCCTTTGAAG CAACCTGTGT CATCTCATAG GAAGGGATTT 15120

GAAACACTTC CCTTAGTTTT CCGTCTCTTA GCTCCACCAT ATGGTCGATA TAGGCTTTAT 15180

AGTCAGATAA ATCATGGTCG CACAAAATAA CTGTCTTCCC ATCATAGACC AACTCTTTTA 15240

GAATCTCCAA TATCTCGATT CTGCTCTTGC GGTCAATGGA AGCGAAGGGC TCATCCAAGA 15300

GATAGACCCT AGGATTCATG GCAAAGAGGA CAGCCAGCGG TGCTTTTTGC TTTTCCCCAG 15360

CTGATAAGTG ATGGATGAGA CGGTGCAAGA TGTCCTTGCA ACGACATTGC TGGACAACCT 15420

CTGCTATTTT AGAATCAATT TCCTGAAGGT GATAGCCGAT ATTTTCCATG GTAAAAAGCA 15480

ACTCCTCAAA CAAGCTCTCC ATGGTAAATT GATGATTAGG ATTTTGCAAG AGAATACCAA 15540

CCGTCTGGAC ACGTTCGACG ATAGAAAGCT GACTGACCTC GCTCCCATCT ATCAGGACTT 15600

GACCGCTATA GGGAAGAGAA CTAACTTGGG CAATCATTTG AAAGAGGCTG GATTTTCGAG 15660

ACCCACTACT CCCAACTAAC AAGGTAAAGG CTTGCGCATG AAAAGTAAAA TCAAACGGCT 15720

CAGAGAAGAT TGGGGACTGA ATCGCTCGTA GTTCCAGACC CATCTATGCT TTTCCTCCAG 15780

TTGCAAACTG ATGATAGAGT TTGACAATGG CACGAACCAA GATGGTACAG AAGAAATAAA 15840 

CAGAAATAAA ACGTACCACA AGCAAGGAAA GGACAAACGG AAGGGAAAAG GCGTAGTAAC 15900 

CTAACTTAAT GTATTCATAG ACAAAGCTAA CAAGCGTAAT CCCAATACTA TTAGCAGTTA 15960 

GAGAGAGCCA ACTTTCATAG CGATTCTTAG TTACGATAAA ACCAAATTCA CTTCCCAAAC 16020 

CTTGAACAAA GCCAGACAAA AGAGCTCCTA GACCAAATTG GCTACCATAA AGGACTTCAG 16080 

CAAGCGCAGC TAGCACTTCT CCAATCGTTG CACTTCCGAC TCTCGGAACA AAGATGGCAG 16140 

CAATGGGCGC AGCCATACAC CAGAGACCGA AGAGGATTTC ATTGGCAAAG GCCTGCAAAC 16200

CAAGAGGTGT TAAGAGTAGA CTGAGAATAT TATACACATA TCCTGAACCA ACGAAAACCC 16260 

CACCAAAAAA GATAGACAAG AAAGCAAGCA AGATAACATC TTTTAACTGC CATTTTTTCA 16320 

ACATAAAAAA CTCCTTTTTT TAAAGAAAAG TGAGGCACTC AAGAAGACCG ACCTAAATAC 16380 

TTTGTATAGC AGACTGAATT TAGAACAGTA CACAAGAACA CTAAAATATT TCTAGAAATT 16440 

AATTTGAATT TTCTAATTGA TTTGTTCGCA TCTTATTTCA ATCTACTATA TCATCTTCAT 16500 

CCAGTTTCGT AAAAGAAAAA ACTCTAATTA CAGATACAAA TTAGAGTTCA GCTTACAAGA 16560 

TTAGACAGTT CTTTTCGACA TACGAAAAAA ACATTTCACA TTTCCCTTCG CCAGTCTTAA 16620 

CTGTATCAGG TTCAATGGGT ATCATCTCAG CCTAAAGCAC CCCAAATGTC TTTATTATTT 16680 

AATTATGTGA TTATTATAAC ACACATTTTA TACTAGTTCA AGAAATTGAA CTGGAAATAC 16740 

AGCCTTGCAC TCACAAAGAC AGCAGATCTT TCTTTTGCAA AAAACAAATG ACCTGTTTGA 16800 

TGAATTAGCC ATTCAAGCTG AATCTGGACA TAGCTTTTTA AAAAAGGAAA ATCCTACTTA 16860 

CTTAGAATCC AAGGATAGAT ATCTATTGTT CACTCATTTC CCGAACAGTT TTTTCTATAT 16920 

TTTTTGCATA CGATATTGCC GAAATGATTG AAACGCCATC CATATTGGTC TTTATAATGT 16980 

CTTTAATATG TTTCGTCTGT ATCCCACCAA TTGCAACTAA AGGCATTTGT GGCAATAGTT 17040 

TTCTCATCAA TTCAAGACCT TCATAACCTA TAGTACCACC AGCATCATCC TTTGACTGGG 17100 

TACCAAATAC AGGCCCAACA CCTACATAAT CTACATATTC AACTTTTGAT TGTTGAAATT 17160 

CTTCTTCGTT TCTTATAGAA AGACCAATTA TTTTATCTGG CATCAATTTT CTAATTTCAT 17220 

CAACACCAAT ATCATCTTGA CCTACATGTA CGCCATCGGC GTCAATTTCC ATTGCTAAAT 17280 

CTATATCGTC ATTAACGATA AATGGAACAT TGTATTTTTT ACAAAGTTCT TTAATTTGGA 17340 

TAGCTAGCTC AAGTTTTTCT AAGCCTTCTA AAGCACCCTC ACCTTTTTCT CGAAATTGAA 17400 

ATAAGGTTAT ACCACCTTTT AAGGCTTCCT CAACGACTGT ATATAGATTT TTTCCTTGGC 17460 

AAGTAGTCGT TCCACAAATA AAATATAGTT TTAGTAATTC TTTATGAAAC ATCTTACTTC 17520

ACTCTTTTGA ATTCCTTTAC ATCTTCATCT GTAATCTCGT ATAAGGCATT TATAAATTCA 17580

ACTTTAAATG TCCCAGGAAG ATGTCCATTT GGACGTTTTT CTGCTATTTC TCCAGCGATA 17640

TTGTAAACCA ACACTGCTGT TTTTAATGAT TTCAATTCTT GACCTTTTTC TAGTCCGATA 17700

AAGCTTGCTA CTACAGCTCC TAATAAGCAT CCTGTCCCAA TGACTTTCGG CATCATAGCA 17760

CTACCATTAT GAATCATTAC CACTTCTCCA TTAACAGCAA TGGCATCCAC TTCACCTGTT 17820

ACTACTATTG GAATATTGAA CTTCTCATTT GCTGCTAGAG CAATTTCGTC AATATTATCT 17880

ACGCCCGCAC TATCTACTCC TTTAGATGCC ACATCTATTC CTACTAAAGA GGCAATCTCG 17940

CCAGCATTTC CTCTAATCGC TGCTAGTTTA TAATTGTTGA TTAGATCATC TGCTACTTTT 18000

TTTCTATATT CTCCTGCTCC ACAGGCTACA GGATCTAAAA CTGCTGGGAC ATTATATTTC 18060

TCTGCAATTT TCAGAGCAGC TTGGTATAAT TTCCAATTTT CATCTGTCAA TGTTCCTATG 18120

TTTATTAATA AACCACCAGC ATACTTTAAC AAATCCTCTA AATCTGCTGG AAACTCACTC 18180

ATGGCTGGTG AGGCGCCCAG TGCTACTAAT CCATTTGCTG TGAAATTTTT TACTACATCA 18240

TTGGTTATAC AAATGACCAA TGGTGCTTTT TCTTTTAATA ATTTTAAACT TGTCATATTG 18300

AAATCCTTCC TTTTCACTTT ATACGATCTA CTAATTTCGA TTTATCTTTA GTTGAGAATT 18360

TTTTTCATTT ACATTGAATG ATTTATACTC AATGAAAATC AAAGAGCAAA CTAGGAGGCT 18420

AACCGCAGGT TGCTCAAAAC ACTGTTTTGA GGTTGTGGAT AGAACTGACG TGGTTTGAAG 18480

AGATTTTCGA AGAGTCTTAC CTCATCAAAT TTGTAAATAT CATGAGCCTT CTCTAGACAT 18540

CGTAACCAAT ATCAAAAAAA GCTAATTCTA AAGCGACTGC TTGATTCCAG CGTTGCTGAA 18600

GTTCTGTCAA ATCTTCTCGA TTTTTACCGA CACGATTGAG TTCGTCAACC AGAAATTGAA 18660

CCCACTCTGC AAAGAAAGGA CCTCTGTGGA GATTGATCCA TTCCGAATGA ATATAGACTT 18720

CAGGTAAAGC CAAATCTTTA GAACCCCAGT CTAAATAGAG ACCTTCTGCA ATGACCAGCA 18780

TGACCAAAAG ATGGGCATAG TCTGATGAAG CCACCGCCGA ATACATTAGA TCCTGAAAGG 18840

CTTTTGTTAC AGGGTGCAAA GTCACTTCTA GATAGTCATT CTCTGCTACT TTTAACTCTT 18900

TAAAAGCCTT TTGGAAATAA CCATCTTCAT CTGCTTCAAG AAAGCCTAGT TGCTTGGCAA 18960

AACGAAGCTT GGATTCAAGT TTATCTGCGT GACTACGCAG GCACCCAGCA TGGATAAGAA 19020

GGCATCAAAG AAGTGATAAT CTTGAATCAG ATAGTCCTTT AAGACCTTAT TCTCAATTGT 19080

CCCCGCAAAA AGTTCCTTAA CAAAACGATG ATTGATTGCA GCCTGCCAAT CCTTCTGACT 19140

GCTTTTTAAT AATTCTCCAA CAGTCAAACC TGGCTGAAAT GCATAGTCTT GTGTTTCCAT 19200

ATTTACTTCT CCTCTCTTTA CTTGTTAGTA ATTAATAAAA CACCAAGAAA TATCAAGCAA 19260

AATCGTAATT CCACTTGATC CTTTTAAAGC ACATCGAGAG CATTTGCAGA GAGCTAACTA 19320

AACAAGCCTA TCCAGTTTAT ATAAACAAAA AACTCCAATT ACAATCAAGA ATTAGAGTTG 19380 

ACTTACAAGA TTAGACCGTT CATTTCACCA TACGAAAAAA CTGTTCACAT TTCCCTTCGC 19440 

CAGTCTTAAC TGTATCAGGT TCAATGGGTA TTATCTCAGC CTAAAGCACC CCAAATGTCT 19500 

CTATTATTTA ACTACTGAAC CAGTATAGCA AAAAATGAAA GCCCTAGCAA GATATTTGAC 19560 

CGAAAAATAT CTTTATATAT AATATATTGA AACTAGAATA GTACACCTCT ACTTATAAAA 19620 

CATTGTTAGA AATCGATTTG ACTGTCCTGA TTGATTTGTC CTATTCTTAT TTCATTTTAC 19680 

TATAGTTTTC GATAGCAATT TATTCTTCCA ATACACGAAG AAAAACCTCC ACATTCAGTG 19740 

GAGGCAATCT GTTTTATCAA TACAATTTTA AGTCACGAGG GTCAACTGGG AAGGTTGGGT 19800 

TGTATGGATT GTGACGGAGC TTGAAGTGTT TGACATCTTC AATGGTCTGA GTTCCAGACA 19860 

ATTGCATAAC TGTCTTCAAT TCCGCATTCA AGTGTTCAAA GACTTGACGC ACACCGACAC 19920 

TACCACCGAG AGCCAAGCCA TAGATGACAG GGCGTCCAAT AGCAACCAAG TCTGCTCCTG 19980 

ATGCCAAGGC TTTAAAGACG TGTTGACCAC GACGAACACC AGAGTCAAAG ACAATCGGCA 20040 

CACGTCTATC AACTGCTTCT GCCACTTCTT GAAGCGAGTC AAAGGCAGCT GGTCCACCGT 20100 

CGATTTGACG ACCACCGTGG TTGGTTACCC AGATACCAGA AGCTCCTGCA GCAAGCGAAC 20160 

GTTCAACGTC CTCACGGCAT TGTGGTCCCT TGACATACAC AGGAAGACCA GAGTATTCAG 20220 

CGATAAATTC TACATCGCGT GGAGACAAGC GTTGTTTAGC TGATTTGTAA ACAAAGTCCA 20280 

TTGATTTACC AGCACCTTCT GGCAGGTATT CTTCAACAAT CGGCATGCCA ACTGGGAAGA 20340 

CAAAACCATT ACGCTTATCC ACTTCACGAT TCCCCCCTAC AGTAGCATCT GCCGTCAAGA 20400 

CAATCGCTTT ATAACCTTCA GCCTTCACAC GGTCCATGAT GTGGCGGTTG ATACCGTCAT 20460 

CCTTACTAAA GTAAAATTGA AACCAATGAG GTGTCCCTTG GAGGGCTTCA GAAATCTCTG 20520

GAAGGTCAAC AGTAGAGTAA GAACTGGTTG TATAAAGAGA ACCAAACTCA TGCACACCAC 20580 

GCGCAGTCGC CACTTCCCCC TGTTCATTTG CCAATTTATG AGCCGCAACA GGTGCCATAA 20640 

TGATTGGAGA AGATAGTTTT TCACCTGCAA ATTCAATCTC TGTACTTGGA TTTTCTACAT 20700 

TGCAAAGTGT ATGAGGAACG ATGAGCTTGT GGTTAAAGGC ACGGATATTC TCTCTTAAAG 20760 

TGAAAGTATC TTCCGCCCCA CTAGCGATAT AGCCAAATGC TGCTTTAGGA ATAACTTGTT 20820 

GCGCCATTGG CTCCAAATCA TAGGTATTGA TGAArTCTAC ATGACCTTCT GCATTGCTTG 20880 

TTTTGTATGA CATAAAATGT CCTCCTTAAT AAGTAAGCGT TTACTTTGTG TATTACAAAA 20940 

ATATCTTAAC TCTTTTTCAA AACTTTTAAA ATATTTTGTT TGGAAATTTC AGAAATTTTA 21000 

TGTCTATGAT AAAAATCCTT ATAACGGCAA TAAAAAATAG ATATTATCCA AAGAAGATTT 21060

TAAGTGCTAC AATAACTGTA TTATTTCTAG ATGGGAGGTT CTATTTTTGG ATTGATCCAT 21120

TGTTGAACAA TATCTACCAC TATATCAAAA GGCATTCTTT CTGACCTTGC ATATTGCAGT 21180

TTGGGGAATT TTGGGATCCT TTCTGCTCGG TTTAATCGTT AGTATCATCC GACATTATCG 21240

AATCCTTGTT TTGGCGCAAG TAGCGACAGC CTACATTGAA TTGTCACGTA ATACGCCCCT 21300

TTTGATTCAA CTCTTCTTTC TCTACTTCGG TCTTCCCCGA ATCGGGATTG TCCTATCTTC 21360

AGAAGTCTGT GCAACGCTTG GGCTTGTCTT TTTAGGAGGC TCCTATATGG CAGAATCTTT 21420

CCGAAGTGGG CTGGAAGCCA TCAGTCAAAC CCAGCAGGAG ATTGGCCTCG CTATTGGTCT 21480

GACACCTCTA CAGGTCTTTT ACTATGTGGT TCTTCCGCAA GCAACAGCGG TGGCACTCCC 21540

CTCCTTTAGT GCCAATGTCA TTTTCCTTAT CAAGGAAACC TCTGTTTTCT CAGCAGTGGG 21600

TTTGGCCGAC CTCATGTACG TCGCCAAGGA TTTGATTGGT CTCTACTATG AGACAGACAT 21660

TGCGCTAGCT ATGTTGGTAG TTGCTTATCT AATCATGCTG CTACCCATCT CACTGGTCTT 21720

TAGCTGGATA GAAAGGAGGC TCCGCCATGC AGGATTCGGG AATCCAAGTA CTCTTTCAAG 21780

GAAATAATCT CCTGAGAATC TTACAGGGAT TCGGCGTTAC GATTGGGATA TCCATCCTGT 21840

CTGTCCTCTT ATCCATGATG TTCAGAACAG TCATGGGAAT CATCATGACC TCCCATTCTA 21900

GAATCATACG ATTTTTAACA CGATTGTATC TGGAATTTAT CCGTATCATG CCCCAGCTGG 21960

TGCTACTCTT CATCGTTTAC TTTGGCTTGG CTCGAAACTT TAATATCAAT ATCTCAGGTG 22020

AGACTTCAGC TATTATCGTT TTTACCCTCT GGGGAACAGC TGAAATGGGA GACTTGGTAC 22080

GTGGAGCTAT CACTTCTCTC CCTAAACATC AGTTTGAAAG TGGACAGGCA CTCGGCTTGA 22140

CTAATGTTCA ACTTTACTAC CACATCATGA TCCCACAAGT CTTAAGAAGA CTGCTACCGC 22200

AGGCTATCAA TCTTGTCACT CGGATGATTA AAACCACTTC ATTAGTTGTT TTGATTGGGG 22260

TTGTGGAAGT GACCAAAGTT GGACAACAAA TCATCGATAG CAATCGCCTG ACCATCCCAA 22320

CTGCTTCATT TTGGATTTAT GGAACCATTC TAATCTTATA TTTCGCAGTT TGCTAGCCTA 22380

TTTCCAAACT ATCCACTCAC TTAGAAAAAC ATTGGAGAAA CTAAATGTGT GAAACTATCT 22440

TAGAAATCAA GGAACTAAAA AAATCCTTCG GAGACAATCC CATCCTCCAA GGACTTTCTC 22500

TAGAAATCAA AAAAGGGGAA GTTGTTGTCA TCCTAGGGCC ATCTGGTTGT GGGAAAAGTA 22560

CCCTCCTTCG TTGCCTCAAC GGCTTAGAAA GTATTCAAGG TGGAGATATT CTTCTGGATG 22620

GTCAGTCTAT CGTTGAAAAT AAAAAAGATT TTCACCTAGT TCGCCAAAAG ATTGGCATGG 22680

TCTTTCAAAG TTATGAACTC TTTCCCCATC TGGATGTCTT ACAAAACGTC ATCCTAGGCC 22740

CTATCAAAGC TCAAGGAAGG GACAAGAAAG AAGTAACGGA AGAAGCTTTG CAATTACTAG 22800

AGCGTGTCGG TTTGCTGGAT AAACAACATA GCTTTGCCCG TCAATTATCT GGTGGACAGA 22860
AGCAACGTGT TGCAATTGTC CGTGCCCTCC TAATGCATCC AGAAATCATC CTTTTTGACG 22920
AGGTGACTGC TTCGCTGGAT CCAGAAATGG TGCGTGAGGT GCTGGAACTT ATCAATGATT 22980 
TGGCCCAAGA AGGCCGTACC ATGATTTTAG TAACCCACGA AATGCAGTTT GCCCAAGCCA 23040 
TTACTGACCG GATTATCTTC CTCGACCAAG GGAAAATCGC TGAAGAAGGA ACAGCTCAAG 23100 
CCTTCTTTAC CAATCCGCAA ACCAAACGAG CCCAGGAATT TTTAAACGTC TTTGACTTTA 23160 
GCCAATTCGG CTCATATCTA TAAAGGAGAT TCTTATGAAA CTATTCAAAC CACTCTTAAC 23220 
TGTTTTAGCA CTTGCCTTTG CCCTTATCTT TATCACTGCT TGTAGCTCAG GTGGAAACGC 23280 
TGGTTCATCC TCTGGAAAAA CAACTGCCAA AGCTGGCACT ATCGATGAAA TCAAAAAAAG 23340 
CGGTGAACTG CGAATCGCCG TGTTTGGAGA TAAAAAACCG TTTGGCTACG TTGACAATGA 23400 
TGGTTCTTAC CAAGGCTACG CTACGATATT GAACTAGGGA ACCAACTAGC TCAAGACCTT 23460 

GGTGTCAAGG TTAAATACAT TTCAGTCGAT GCTGCCAACC GTGCGGAATA CTTGATTTCA 23520 

AACAAGGTAG ATATTACTCT TGCTAACTTT ACAGTAACTG ACGAACGTAA GAAACAAGTT 23580 

GATTTTGCCC TTCCATATAT GAAAGTTTCT CTGGGTGTCG TATCACCTAA GACTGGTCTG 23640 

ATTACAGACG TCAAACAACT TGAAGGTAAA ACCTTAATTG TCACAAAAGG AACGACTGCT 23700 

GAGACTTATT TTGAAAAGAA TCATCCAGAA ATCAAACTCC AAAAATACGA CCAATACAGT 23760 

GACTCTTACC AAGCTCTTCT TGACGGACGT GGAGATGCCT TTTCAACTGA CAATACGGAA 23820 

GTTCTAGCTT GGGCGCTTGA AAATAAAGGA TTTGAAGTAG GAATTACTTC CCTCGGTGAT 23880 

CCCGATACCA TTGCGGCAGC AGTTCAAAAA GGCAACCAAG AATTGCTAGA CTTCATCAAT 23940 

AAAGATATTG AAAAATTAGG CAAGGAAAAC TTCTTCCACA AGGCCTATGA AAAGACACTT 24000 

CACCCAACCT ACGGTGACGC TGCTAAAGCA GATGACCTGG TTGTTGAAGG TGGAAAAGTT 24060 

GATTAGTCAT TAACTCTTAA AAGGAACTGG ATTTTAAGCT CCAATCCCTT TTTAAGATTT 24120 

TACCTATAAC ATCCTGAGTC TATCTAAGAT GTTCAATCTG AACACAGTGT ACATACTTTA 24180 

TCTTCTATTG CATATACTTT ATCACATAAG ATACGAATAT CCTCTTCACT ATGACTAGCA 24240 

ATCAAAATTG TTGTCCCTTT TTCACTAGAG AGCTTTCTAA ACAATGTTCT CATATTTTCT 24300 

ACACTTGATT TATCCAAGGC ATTCATAGGT TCATCTAGTA AAAGAATAGA GGGATTCTCC 24360 

ATAATTGCTT GAGCAATCCC TAGCTTTTTC CTCATACCTA GCGAATAAGT TTTAACTTTC 24420 

TGGTCTTTTT GCTCATATAG ACCAACTATT TTCAGTGTAT CATTGATTTC CTGATTACCA 24480 

ACTACTCCTC GTATGCTTGC CAAATATTGT AAATTCTTAA AGCCACTATA ATAATTTATA 24540 

AAACCAGGTT CTTCAATCAA AGCTCCCAAA TTAGCTGGAA TTTTTCTCTC AGGAACAATA 24600 
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TTTTCCCCAT TGATTAACAC TTCTCCATAA GACGGACTAT ATAAACCAGC TATTAATTTA 24660 

AACAATACAC TTTTCCCTGA GCCATTCGCA CCAGTAATTC CTATAATTTC CCCCTGTTTA 24720 

CAACTAAAGT TAAGGTTTTG AAAAACACAT GTCTTTTTTA ATTTCAACTC AATATTTTTT 24780 

AATGTAATTA TTTCATTCAT TCTATAAACC TCCTCTTTTG ACGAGTGAAA TAGAAAATGC 24840 

TTTGAAAAAG AAAGACTAAA AATAGCAACT GAAGAAATAA ATCTCGTCCT ATATCTCCAT 24900 

TCCCTCGATT CAAAATATAA AATAGATAAT TAGTTCGATT TCCTACAAAT AGACCACCAA 24960 

ACACAATCAT GAGTAAAAAG AAACTAACGC AAGCAAAGTT CG 25002 
(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11443 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:



CAGGTACGGT GAGGCGCAAC TAAAATATAA TTTTCATCTT GATTAGGAAT TTTATCAGTA 60

TTATGATAGT GAGCATTGCC ATTGATGGAC CATAAGAGCA ATACAACTAA TCCACGCAAA 120

TAAGTATAAA ACATGCGATC TCCTTCGATT GTTTTCTTGT TATTATTATA CCTTATCAAA 180

GGAGGGCTGG CAAACTTTTC CCTTGACTAG ATACATATTT AGGATGAAAT TAGAATTCTG 240

TTAAAAAAAA TGATATAATA GAATTTATGG ATAAAAATAA GATTATGGGA TTAACCCAAA 300

GAGAAGTCAA GGAAAGACAG GCTGAGGGTT TGGTCAATGA CTTTACCGCA TCAGCCAGTA 360

CCAGCACTTG GCAAATCGTT AAACGAAATG TCTTTACCCT TTTTAACGCT TTGAACTTTG 420

CCATTGCTTT GGCTCTTGCC TTTGTGCAGG CTTGGAGCAA TCTGGTCTTC TTTGCTGTTA 480

TCTGCTTTAA CGCTTTTTCT GGGATTGTGA CCGAGCTACG AGCCAAACAC ATGGTGGACA 540

AGCTCAATCT CATGACCAAG GAAAAGGTCA AAACCATCCG TGATGGTCAG GAAGTTGCTC 600

TTAATCCTGA AGAATTAGTG CTAGGAGATG TCATTCGTTT GTCTGCAGGA GAGCAGATTC 660

CTAGTGATGC CTTGGTTTTG GAAGGCTTTG CGGAAGTCAA TGAAGCCATG TTAACGGGAG 720

AAAGTGATTT GGTGCAAAAG GAAGTTGACG GCTTACTTTT GTCAGGAAGT TTCCTAGCCA 780

GTGGGTCAGT TTTATCTCAA GTTCACCATG TCGGTGCAGA CAACTATGCT GCCAAACTCA 840

TGCTTGAGGC TAAGACCGTT AAACCCATCA ACTCCCGTAT CATGAAATCG CTGGACAAGT 900

TGGCTGGTTT TACTGGGAAG ATTATCATTC CCTTTGGTCT GGCTCTCTTG CTGGAAGCCT 960

TGCTTTTAAA AGGCCTGCCT CTCAAGTCAT CCGTTGTAAA CTCGTCGACA GCTCTTTTGG 1020

GAATGTTGCC TAAGGGAATT GCCCTTTTGA CCATTACTTC GCTCTTGACT GCAGTGATTA 1080 

AGTTGGGCTT GAAAAAGGTC TTGGTGCAGG AGATGTACTC TGTTGAGACC TTGGCGCGCG 1140 

TGGATATGCT CTGTCTGGAC AAGACGGGTA CCATCACCCA AGGAAAGATG CAGGTGGAGG 1200 

CTGTTCTTCC GTTGACGGAA ACGTATGGTG AAGAGGCTAT TGCCAGCATC TTGACTAGCT 1260 

ACATGGCCCA TAGTGAGGAT AAGAATCCAA CTGCCCAAGC CATTCGCCAG CGTTTTGTGG 1320 

GAGATGTTGC TTATCCTATG ATTTCCAATC TTCCCTTCTC GAGCGACCGC AAGTGGGGGG 1380 

CTATGGAGTT AGAAGGCTTG GGGACAGTTT TCTTAGGGGC ACCTGAGATG TTGCTTGATT 1440 

CTGAAGTCCC AGAAGCTAGG GAGGCCTTGG AGAGAGGATC ACGTGTCTTG GTCTTAGCTC 1500 

TCAGTCAGGA GAAATTAGAC CATCACAAAC CACAGAAACC ATCTGATATT CAGGCTCTAG 1560 

CCTTGCTGGA AATCTTGGAC CCCATTCGAG AGGGAGCAGC AGAGACGCTG GACTATCTCC 1620 

GTTCTCAGGA GGTGGGACTC AAGATTATCT CTGGTGACAA TCCAGTTACG GTGTCCAGCA 1680 

TTGCCCAGAA GGCTGGTTTT GCGGACTATC ACAGCTATGT AGATTGCTCA AAAATCACCG 1740 

ATGAGGAATT GATGGCCATG GCGGAGGAGA CAGCTATTTT CGGACGTGTT TCCCCTCATC 1800 

AAAAGAAACT CATCATCCAA ACGTTGAAAA AAGCGGGACA TACAACGGCT ATGACAGGGG 1860

ACGGGGTTAA TGATATCTTG GCCCTTCGTG AGGCGGATTG TTCTATCGTG ATGGCGGAGG 1920 

GGGATCCAGC AACCCGTCAG ATTGCCAATC TGGTTCTCTT GAACTCAGAC TTTAATGATG 1980 

TTCCTGAGAT TCTCTTCGAG GGTCGTCGCG TGGTCAATAA CATTGCCCAC ATCGCCCCGA 2040 

TTTTCTTGAT AAAGACCATC TATTCCTTCC TGTTAGCAGT CATCTGTATT GCCAGTGCTT 2100 

TACTAGGTCG GTCAGAGTGG ATTTTGATTT TCCCCTTCAT TCCGATCCAG ATTACCATGA 2160 

TTGACCAGTT TGTGGAAGGT TTCCCACCAT TCGTTCTGAC TTTTGAGCGA AATATCAAAC 2220 

CTGTTGAGCA GAATTTCCTC AGAAAATCCA TGCTTCGTGC CCTACCAAGC GCTCTCATGG 2280 

TCGTCTTCAG CGTCCTGTTT GTGAAAATGT TTGGCGCGAG TCAAGGTTGG TCTGAGTTAG 2340 

AAATCTCAAC TCTACTCTAT TATCTCTTGG GGTCAATTGG TTTCTTATCC GTATTTAGAG 2400 

CCTGCATGCC ATTTACCCTA TGGCGTGTCC TCTTGATTGT TTGGTCAGTA GGAGGTTTCC 2460 

TAGCCACAGC TCTCTTCCCA AGAATTCAAA AACTGCTTGA AATTTCAACC TTAACAGAAC 2520 

AAACGTTGCC TGTTTATGGT GTCATGATGT TGGTCTTTAC CGTGATTTTC ATCCTGACCA 2580 

GTCGTTACCA AGCGAAAAAA TAAATCAAAA CCACCAGTGT GAACTGGTGG TTTGTTCTGC 2640 

GGCTATAAGC CGCTTCTACC GGCCAGGGCC AAAGGCCCAC CGAAATAGCT TCCTCGCGCA 2700 

CCACTTTCCC GAGCAGGTGC TAAAGCACCT TAGTTACTTC CTCTTATTTA TTTCGCCAGT 2760 
AAACGGATCT ACTGACTCGA ATAACGTGAG CTGGTCTGCT ACTCTGTCTT CTTGTAATTG 2820

ATTCTGAATA TATTCAGCTA TCACTTTCTG ATTACGGCCT ACCGTATCTA CATAATAGCC 2880 

TCTACACCAA AACTTGCGAT TGCCATATTT GTATTTTAAA TTCGCATGCT TATCAAAAAT 2940 

CATCAAACTG CTCTTGCCCT TTAAATAGCC CATAAAGGAC GAAACACTAA GTTTCGGAGG 3000 

AATACTGATA AGCATGTGAA TATGGTCTGA ACAAGCATTC GCTTCATGGA TTATTACACC 3060 

GTTACGCTCA CATAAGTCAC GTATGATTCT TCCGATACTA GCTTTGTATC TGCCATAAAT 3120 

GATTTGACGA CGATATTTGG GTGCAAAAAC AATATGATAT TTACAATTCC ATGTGGTATG 3180

TGATAAACTT TGATTATCCT CTCTCATGAG GTACCTCCTG TATGATATGT TGTAGTGGCG 3240 

GAGAAACCAC TTCTATCTTA TCATTTTAGG AGGTTCTTTT TGTTACCACG CTAAAAGCTC 3300 

TATGGAACCA CTAGCATAGC TAGTGGTTTT CGGGAGACAA CAAGAAAGAC TGCAATCTGT 3360 

GGATTGCAGT TTTTTATACG ATGGATCTAT CGTAGATCTG ATGTGCAAGG CCTACGTGCC 3420 

GATCATCTAT CGGTGAACCC AAGAGCGACC CTCAAGCCTG CTTGGATTGA GGTAATAGAT 3480 

TCAAATATCT GTAGTTAGAC TATTTGAAGT TTGATGTAAG AAAGAGAAAG CGACAGATTG 3540 

AAGTAATTTT AACTCTCTTC TATTGCTAGA ACAAATGGTC GGATAGGTTG GTAGTTTGAA 3600 

AATGAAGATG CTATCTATTG TTAAATGGAA CATAGTGTTA TTTATTAGAA AATCGTTTGG 3660 

TTTATTTCTT ATCAAATACG AAAAGCAACT TAAATATTTC AACTAAAATA GATGTTATGA 3720 

AGAAAAGGTA AAATGATTTT GGCATAGTGA GGTTCTGTTC TATTTGATAT CATATTTTTG 3780 

ATAAAAACAA AAATGTCCAT TGCAAAGGAC AAAATGCGAA GTATATTATT TTTTGAAAGC 3840 

GATATAATGG ATTCATAAAG GAGGTGTATC GTGTCTAGAA AACAAGAACA AATGGAAACG 3900 

TTGTTGCTCC TTTTGCGAGA TAGTAAGGAT TATATATCTG CTAAAGTATT GGGAGAAAAA 3960 

TTAAATTGCT CTGATAAAAC GGTTTATCGC CTTGTCAAGG GAATCAACAA AGATTGTCCG 4020 

GTAGAAGCAT TCATTTTATC TGAAAAAGGC AGAGGTTTCA AATTAAATCC AAGAAGTTCC 4080 

CTCGTGGACG TTGATGGGAA TTTTACAGAG GCTTTTGATC CTGAAGTAAG GCGTGAAAAA 4140 

TTACTAGAAC GTCTCTTGTT GACTGCTCCT AAGCCACATT CTATTTATGA TTTAGGAGAG 4200 

GAATTCTACG TAAGCGAGTC AGTAGTACTA AAAGATCGTC AGATATTACA AGAGAGTCTA 4260 

GCAATTTATG GGTTAGATTT AAAAATGAGA CAACGAAAGC TTTTTATTGA TGGGGATGAG 4320 

GCTCAAATTC GTTCAGCCAT TCTAAATCTA CTGCCAATGT TTAATCAGTT GGATTTAGAG 4380 

CAAATTACAC AGAATAAGGT TCAGCCTCTT GACGGAGAAC TTGCTCACTT TTGTTTGGGA 4440 

TTACTGATTA CACTTGAGAG AGAATTGGGG GTAAACATTC CCTATCCATA TAATATAAAT 4500 
ATTTTCTCTC ACCTGTATAT TTTTATCAGT AGGAATCGTC GTAGTACTAG TATTCATGTT 4560

GTAGCACCTT CAAAACCTAC TATTGTTGAT GAGAAAATTT ACAGTGTCTG TCAAAAAATT 4620

ATTCAAGAAA TTGAACAATA TTTTAGGATG AAGGTTGATG CAGTTGAGAT TGACTATCTT 4680

TATCAATACG TTGTATCTTC GAGATTGCAA AAACCATTTT CTTCCGGGAA GCTTCCTTTT 4740

TCTCAGCGAG TTTTAGATGT CACTCATTAC TATTTTAGCC GTATGTGTAT GGACAATAGA 4800

GAGATTGAAA CGACAGATCC TGACTTTGTT GACTTGGCGA GTCATATCAG TCCCTTACTG 4860

AGGAGATTAG ATAATAGAGT ACAGATTAAG AATAGTCTTT TATCACAAAT TCTTTTAACC 4920

TATCCTAATC TGGTTAAAGA GTTAACAACT ATTTCTAAAG AAGTGAGTCT AGTATTTGGT 4980

TTTGCTTCCT TGAGTCTGGA CGAGATTGGT TTTCTAGTCT TATATTTTGC ACGGTTTCAA 5040

GAAAAGCGAG CACGTCCTCT AAAAACAGTA GTGATGTGTA CATCAGGTGT CGGAACTTCA 5100

GAGCTTTTAC GAGCACGATT AGAAAAGCAA TTTTCTGAAT TGGATATTAT TGATGTAGTT 5160

GCTTATCATC AATTAGATGA GCTGATAAAT CTATATCCAG ATTTAGATTT CATTGTGACG 5220

ACGGTAGCTT TGCAGGAACC AGCAAGTGTC CCGTTTGTCC TAGTTAGTGT TTTTCTAACC 5280

GAGGGTGATA AACAACGTCT TCAAGCAAAA ATTCAGGAGA TAAACTATGA ATAATCTTTC 5340

GCTTGTCCTT ATGGATATAT CTGTTCAAAA TCGTCAAGAA GCCTACAAAG AATTAGCAAA 5400

TCAAATCAGC CTTCTTGTTT CTGAAGATAC AGAAAAAATA GAAGAGCTTC TATATTACCG 5460

TGAGAGACAG GGAAGTATAG AGGTTGCTAA AGGTGTTCTT CTACCACATT GTGAAGGAAA 5520

CTTTCAACAT CATGTCTTAG TGATTACTAG ATTAAAATCA CCTATCAGAG AATGGTCGAA 5580

GGATATCCAG TGTGTTGACC TTATTATCGG TTTGGCCATT GCAGTATCAC AGGACAAGTC 5640

ATGTATTAAA ACATTGATGA GAAGACTAGC AGATGAATCA TTCATAAATC AATTAAAACA 5700

GTTAACAAAA GAAGAATTAC GGGAGATAAT ATATGGAAAT CAAAGATATT CTTAATGTGA 5760

GTCTGATCCA GACGGATTTA CAGATGCAGA GCAAAGAAGA GGTTTTTGAG GCATTAGCTC 5820

AACTATTGGT TGAGACGGGT TATGTGTCTG ATAGAGACCA ATTTATCGAA GGTCTTTATC 5880

AGAGAGAGGC AGAAGGACAG ACCGGTATTG GGAATTATAT TGCTATTCCC CATAGCAAGA 5940

GTTCTGCTGT GGAGAAGGCG GGGGTAGTCA TAGCTATAAA TCACAATGAG ATTCCTTGGG 6000

AGACCATTGA TGGGAAAGGG GTCAAAGTAA TTGTACTCTT TGCAGTTGGT GATGATACAG 6060

AAGCTGCTAG GGAGCATTTG AAGACCTTAT CACTCTTTGC TCGAAAACTT GGTAATGACG 6120

AAGTTGTTGC CAAATTAGTT CGGGCTCAGA CATCTGATGA TGTGATTGCA GCTTTTTGTT 6180

AATAAGAAAA AATTTTGGAG GGTATCCGTA TGAAAATTGT TGGTGTTGCA GCTTGTACTG 6240

TGGGAATTGC CCACACTTAT ATTGCACAGG AAAAATTAGA GAATGCCGCA AAGGTAGCTG 6300

GACATGTGAT TCATGTTGAG ACTCAGGGGA CAATAGGGGT AGAAAATGAA TTGAGTCAAG 6360

AGCAGATTGA TGCAGCGGAT GTAGTTATTT TAGCAGTTGA TGTTAAGATT TCTGGTATGG 6420

AACGCTTTGA GGGTAAAAAG ATTATCAAGG TTCCAACAGA AGTGGCAGTC AAATCTCCCA 6480

ATAAACTGAT TGCTAAAGCT GTTGAGATTG TTACGAAATA ACTGAAAATA TTTAAGGAGA 6540

AAATATATGT TGAAACACTT AAACTTAAAA GGTCACTTAT TGACAGCCAT TTCCTATATG 6600

ATTCCAATTG TTTGTGGTGC AGGATTCTTA GTTGCCATTG GTTTAGCAAT GGGGGGTGGT 6660

GTTCCTGACG CTCTTGTAGC AGGAAAATTC ACTATCTGGG ATGCTTTAGC AACTATGGGT 6720

GGTAAAGCCC TTGGTCTCTT GCCAGTTGTT ATTGCTACAG GTTTGTCTTA CTCGATTGCT 6780

GGTAAGCCAG GGATTGCACC AGGTTTTGTT GTTGGTCTAA TTGCCAATTC TGTTGGTTCA 6840

GGGTTTATCG GTGGTATCTT GGGAGGTTAT ATAGCTGGTT TCTTGGTTCA AGCGATTATT 6900

AAAAAGGTCA AAGTACCAAA CTGGATTAAA GGTTTAATGC CAACCTTGAT TATTCCTTTT 6960

GTAGCCTCTT TGGTAAGTAG TTTGATTATG ATTTATATTA TTGGAGCGCC TATCGCAGCC 7020

TTTACCAACT GGTTGACGAG CTTATTACAA AGCTTGGGAA GTGCTTCAAA TGGTTTGATG 7080

GGGGCAGTTA TTGGAATTCT CAGTGCTGTT GACTTTGGTG GCCCACTTAA TAAAACAGTC 7140

TATGCGTTTG TGTTGACTTT ACAGGCTGAA GGTGTGAAAG AACCATTGAC TGCTTTACAA 7200

TTGGTGAATA CTGCTACACC AGTTGGATTT GGATTGGCCT ATTTTATCGC GAAATTACTC 7260

AAAAAAAATA TCTATACTCA AGAGGAAATC GAAACATTGA AATCGGCTGT TCCTATGGGG 7320

ATTGTCAATA TTGTTGAAGG TGTAATTCCG ATTGTTATGA ATAACTTGGT TCCAGGTCTC 7380

ATTGCAACAG GTATCGGTGG TGCTGTTGGT GGTGCTGTTT CTTTGACAAT GGGTGCTGAT 7440

TCTGCTGTGC CATTTGGTGG AGTGCTTATG TTACCAACCA TGACTCGTCC AGTAGCTGGT 7500

ATTTGTGCCT TGTTAGCTAA CATTGTAGTC ACAGGACTTG TCTACGCGAT TTTGAAAAAA 7560

CCAATAAAAC ATGCAGAACC AGTTATGACT GTTGAAGAAG AGATTGATTT GTCAGATATT 7620

GAAATTTTGT AAGAGGGTAA CGATGTCAAG AATTGAATTT TCACCATCTT TGATGACCAT 7680

GGATTTGGAC AAATTCAAAG AGCAGATTAC TTTTTTGAAT GATAAAGTAG CATCTTATCA 7740

TATCGATATT ATGGATGGCC ATTTTGTTCC CAATATTACC TTGTCTCCTT GGTTCATTCA 7800

AGAAGTTCAA AAAATTAGTG ACACACCTTT ATCAGTTCAT CTGATGGTCA CAGACCCAAC 7860

CTTTTGGGTA GATCAAGTTC TCGATTTACA ATGTGAGTAT ATTTGTATTC ATGCTGAAGT 7920

TCTGAATGGT CTTGCTTTTC GTTTGATTGA TAAAATTCAT GATGCAGGTC TAAAGGCTGG 7980

TGTTGTCCTT AATCCTGAAA CACCTGTTTC TACAATCTTT CCCTACATTG ATTTACTTGA 8040

CAAAGCAACT ATTATGACTG TAGATCCAGG TTTTGCAGGA CAACGCTTTT TGGAGTCTAC 8100

CTTGTATAAA ATCCAAGAAC TCCGTCAGCT TAGAGTTCAG AATGGTTATC ACTACATCAT 8160 

TGAGATGGAT GGTTCTTCGA GTCGTAAGAC TTTCAAACAA ATTGATGTGG CAGGACCAGA 8220 

TATTTATGTT ATAGGTCGCA GTGGATTATT TGGTTTGGAT GACGATATTG CCAAAGCCTG 8280 

GGATATCTGT TCTAGAGATT ACGAAGAAAT GACCGGAAAA ACAATGCCAA TCAAATAATG 8340 

GTTTGAGAAG AAATTTATTA GTTAGGAGGA ATATATGTCA CTACAATCAG TTAACGCCAT 8400 

TCGTTTTCTT GGCGTAGATG CTATTAACAA ATCTAATTCT GGTCACCCGG GAATTGTCAT 8460 

GGGTGCTGCG CCAATGGCTT ATAGCCTATT TACAAAGCAC CTTAGAATTA CACCTGAGCA 8520 

GCCAAACTGG ATTAACCGAG ATCGCTTTAT CTTGTCTGCG GGTCATGGAT CAATGCTACT 8580 

GTATGCTCTC TTGCATTTAA CAGGGTATAA GGATGTATCC ATGGACGAGA TTAAAAATTT 8640 

CCGGCAATGG GGATCTAAGA CACCTGGTCA TCCTGAAGTG ACGCATACGT CTGGTGTGGA 8700 

TGCGACATCT GGTCCGCTTG GTCAGGGGAT TTCTACTGCC GTTGGTTTCG CCCAAGCAGA 8760 

GCGTTTTTTA GCTGCTAAGT ACAACAAAGA TGGTTTCCCT ATTTTTGACC ATTATACTTA 8820 

TGTTATCGCT GGAGACGGTG ACTTCATGGA AGGAGTGTCT GCGGAGGCGG CTTCTTATGC 8880 

AGGTCATCAA GCTTTAGATA AGCTTATCGT CCTCTACGAC TCCAACGACA TCTGCTTGGA 8940 

TGGTGAGACC AAAGATACTT TCTCTGAAAA TGTTCGCGTC CGTTACGATG CTTATGGTTG 9000 

GCATACAGTT CTGGTAGAAG ATGGAACAGA TTTAGCAGCA ATTTCTACAG CAATTGAGAC 9060 

GGCCAAGTTT TCTGGTAAAC CGAGTTTGAT TGAAGTGAAA ACGGTAATTG GTTACGGCTC 9120 

ACCCAATAAA AGTGGTACAA ATGCTGTTCA TGGTGCACCA CTAGGAGCAG AAGAAACAGG 9180 

AGCAACTCGT AAGTTTTTGG GATGGGATTA CGATCCATTT GAAGTACCAG AGGAAGTATA 9240 

TTCTGATTTC AAGACAAATG TAGCGGATCG TGGTCAGGAG GCATACGATG CTTGGGCTAG 9300 

TTTGGTGTCT GATTACAAGG TTGCTTATCC CGAAGTTGCT AGTGAGATTG ACGCTATTGT 9360 

AGCTGGAAAA TCCCCTGTAA CCATTACTGA AAAAGACTTC CCTGTCTATG AGAATGGCTT 9420 

CTCTCAAGCA ACTCGTAATT CGTCCCAAGA TGCTATTAAT ACAGCAGCAG TTTTACCAAC 9480 

CTTCTTAGGT GGATCGGCAG ACTTAGCTCA CTCTAACATG ACCTACATCA AGGCAGATGG 9540 

CTTACAAGAT AAATATAATC CATTAAACCG CAATATTCAG TTTGGGGTAC GTGAATTTGC 9600 

CATGGGAACA ATCCTCAATG GAATGGCTCT TCATGGTGGT TTACGAGTTT ATGGCGGAAC 9660 

CTTCTTTGTT TTCTCTGACT ACGTCAAAGC TGCTATTCGG CTATCAGCCA TTCAGGAGTT 9720 

GCCTGTAACT TATGTCTTTA CCCATGATTC AATTGCCGTT GGTGAAGATG GTCCAACTCA 9780 

TGAACCAGTT GAACATTTGG CAGGTTTACG CTCAATGCCA AACTTGACTG TTATCCGTCC 9840 
AGCGGATGCC CGTGAAACTC AAGCGGCTTG GCATCATGCC TTGACCAGTA CCACCACTCC 9900

AACTGTCATT GTCTTAACCC GTCAAAACTT GGTAGTTGAA GAAGGGACAG ACTTTGGTAA 9960

GGTCGCTAAA GGAGCCTACG TCGTGTATGA TACCCCGGGA TTTGATACTA TTATCATTGC 10020

TACAGGATCT GAGGTCAATC TAGCTATCAA AGCTGCTAAG GAATTGGTTT TACAAGGTGG 10080

TAAAGTACGT GTGGTATCTA TGCCCTCAAC CGAACTATTT GATGCTCAAG ATGCTACCTA 10140

CAAGGAAGAC ATTTTACCAT CTAAGACTCG TCGTCGTGTG GCCATTGAAA TGGCAGCGAC 10200

CCAAAGTTGG TACAAGTATG TTGGTTTGGA TGGCGCGGTC ATCGGTATTG ACATCTTCGG 10260

TGCGTCTGCC CCAGCTCAGA CTGTGATTGA TAATTATGGA TTTACGGTAG AGAATATCGT 10320

TGCTCAAGTT AAGTCCCTAT AGAAACCAAT TACAATGAAG ATACAGCTGT TGTCAGACTA 10380

GCAGATGTAG TGATAGACAC TAATCAGATG ATTGGTTATT TAAAAACTGT AATGAAAATG 10440

TAATAATTTA TCTACGAAAG TTATAGTAGA TAGTATACAC AATAGAGTAT ACCCTGAAAC 10500

GGTTGCGAAG TACGCTAATC ACTTTGCTAC TGATCTAGAT AGTTTCTTTA ATCAATAAAC 10560

ACAGCATCCA CAGATTGACT TAGGATATTG TAAGTTTTTT GAAAGCTAGA GAGAAGGTCT 10620

CTAAAATTAA AAAACGCATA GTATAGGATG TTGAAATGAT GAACTGCACC CCAAAAGTTA 10680

GACAGAAAAA AATCTAACTT TTGGGGTGTT TTTATTATGA AATTAACTTA TGATGATAAA 10740

GTTCAGTTCT ATGAACTTAG AAAACAAGGA TATATCTTAG AGAAGCTTTC AAATAAATTT 10800

GGGATAAATA ATTCTAATCT TAGGTACATG ATTAAATTGA TTGATCGTTA CGGAATAGAG 10860

TTCGTCAAAA AAGGGAAAAA TCGTTACTAT TCTCCTGATT TAAAACAAGA AATGATTCAT 10920

AAAGTCTGAC ATGAAGGCTG GACTAAAGAT AGAGTTTCTC TTGAATACGG TCTCCCAAGT 10980

CGTACGATAC TTCTTAACTG GCTAGCACAA TACAGGAAAA ACGGGTATAC TATTGTTGAG 11040

AAAACAAAAG GGAGAGTACC TGAGAGCGGA GAATGCCATC CTAAAAAAGT TAAGAGAACT 11100

CCGATTGAAG GAGGAAAAAG AGAAATAAGA AAGACAGAAA TTGTTCAAGA ATTAATGACT 11160

GAGTTTTCGT TAGATCTTCT TCTAAAAGCC ATTAAACTAG CTCGTTGGAC CTACTACTAT 11220

CACTTGAAAC AGCTAGATAA ACCAGATAAG GACCAAGAGC TTAAAGCTGA AATTCAATCC 11280

ATCTTTATCG AACACAAGGG AGATTATGCT TATCGCCGGG TTCATTTAGA ACTAAGAAAT 11340

CGTGCTTATC TGGTAAATCA TAAAAGAGTT CAAGGCTTGA TGAAAGTACT CAATTTACAA 11400

GCTAGAATGC GACAGnAACG AAAATATTCT TCTCATAAAG GAG 11443


(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5338 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:






cthfti lALAi TATATTATCA AAATCGTCGA AACTGGCTCC ATGAATGAGG CAGCCAAGCA 60

ACTCTTTATC ACTCAGCCAA GTCTCTCCAA TGCAGTGCGA GATTTGGAAA ATGAAATGGG 120

CATTGAGATC TTTATCCGCA ATCCCAAGGG AATCACCTTG ACCCGTGATG GCATGGAGTT 180

TCTCTCTTAT GCCCGTCAGG TTGTCGAGCA GACCCAGCTT CTGGAGGAAC GCTATAAAAA 240

TCCTGTCGCC CACCGCGAAC TCTTTAGCGT TTCGTCTCAA CACTATGCCT TTGTGGTCAA 300

TGCCTTTGTC TCTTTGCTCA AGAAAAGCGA TATGGAGAAA TACGAACTCT TCCTTCGTGA 360

AACTCGGACT TGGGAGATTA TCGACGACGT CAAGAACTTC CGCAGTGAGG TCGGGGTCCT 420

CTTCTTAAAC AGTTACAACC GTGATGTTTT AACCAAGATG CTGGATGACA ATCACCTGCT 480

AGCCCACCAT CTCTTCACAG CGCAACCGCA TATCTTTGTC AGCAAGACCA ACCCTCTGGC 540

AAAGAAAGAC AAGGTGAAAC TGTCTGATTT GGAGAATTTC CCTTACCTCA GCTATGACCA 600

AGGGACGCAC AACTCCTTCT ACTTTTCAGA AGAGATTCTT TCTCAAGAAC ACCACAAGAA 660

ATCCATTGTG GTCAGTGACC GTGCCACCCT CTTTAATCTC TTGATTGGTT TGGATGGTTA 720

TACCATTGCG ACAGGGATTT TGAACAGCAA CCTAAACGGA GACAATATCG TTTCTATCCC 780

ACTGGATATT GATGACCCGA TCGAGCTGGT CTATATCCAG CATGAGAAAA CCAGCCTATC 840

TAAGATGGGC GAACGCTTTA TAGACTATCT CCTAGAAGAA GTTCAGTTTG ATAGTTGAGA 900

AATGATAAGA ACCAATATGT AGGCTAGCAA CAACCTGCAC ATTGGTTCTT TTTACTTATA 960

ATTAAAAGTT TCCCCTGCCA ACTTATCAGC TAGCTTGGGA AAGAGAGTAT AAAACTTATG 1020

GGCTAGGTTC AACAAAATCG GGAGATTGAG TTCTCGTTTG TTTTTTCCTA TAATCTTGAC 1080

AATCTTTTTA GCCACTGCAT CTGGTTCTAG CAGGAAGCGA TCAACCGATT TAAGATAAGT 1140

TCCATCTGGG TCGGCTTGGT CGAAAAATCC TGTACGGATT GGTCCTGGAT TGACTGTTGT 1200

CACATAGACT CCATAGGGCA TAAGTTCGAG TCGCAGAGCA TTTGAAAAAC CAATAGCCGC 1260

AAACTTGGTC GCTGAGTAAA GACTAGACTT GCCAGTAGCT ATTAGACCTG CCATGCTGAC 1320

GATGTTGATG ATATGCCCTT TGCTGCTTTC CTTCATACGA GCCGCAAGGT GACGAGACAG 1380

ATTCATCAGG GCAAAGGTAT TGACCTCAAA CATCTGGTGA ATATCTTTAT CAGCAATCTG 1440

GTCAAATCCC TCAAAAATCC CGTAACCAGC GTTGTTAATC AAGACATCAA TCTTGCCATA 1500

GCGGAGATAA AGATCAGTTA CCAGAGCTTC TAGGGCTGAA TCGTCGGTAA TATCAATTTC 1560

AATCAATTCT GCATGGGAAT AATTTCCGTA GAGTTGGGCT AATTTTTCCT TATTTCTACC 1620

AAGCAAGATG AGTTGGTCAT TGGGCAGGAG TTTGACCATT TCTTGAGCTA GACCACCGCT 1680

AGCTCCGGTA ATGAGAATAG TAGGCATACT TATCCTTTCT GTGACTGCTA GATTTCCACT 1740

TCTTCCAAGT CTTTGACCAC ATGGACATTT TCAAAAATTG TGGCAGCGTC TTTCTTGAGT 1800

TTGCTAATAT CTTTTGAGAG GAAACGGGCA CTGATATGGT TGAGTAGGAG GCGTTTGGCA 1860

CCTGCTTCTA CCGCTACTTG TGCAGCTTGC ATATTAGTTG AGTGACCATG GTTACGAGCA 1920

ATTTTTTCAT CACCCTTGCC ATAAGTGGAC TCATGAACTA GGACATCTGC ATTGACAGCC 1980

AGACGCACAC TGGCACCCGT TTTTCGAGTG TCTCCTAAAA TAGTGATAAT CTTACCTGGA 2040

CGTGGCGCTG AGATATAGTC TGCTGCCTTG ATTTCAGTTC CGTCTTCCAA AACAAGATCC 2100

TGGCCGTTTT TGATTTTACC AAAAAGCGGG CCGAACGGAA CACCAGCAGC CTTGAGTTTT 2160

TCAGCATCCA GCGTCCCTTC TAGATCCTTT TGCATGACAC GATAGCCAAC ACAGAAAATA 2220

GTGTGGTCCA GCTCCTCTGC ATACACAGTG AATTTATCGG TTTCAAGAAT TTTACCCAGA 2280

GAATCTTGGT CAAACTCATG GAAATGAATG CGGTAGGGCA GACGAGAACC TGACACACGA 2340

AGGCTGGTTA AGACAAATGA CTTGATTCCT TGAGGTCCGT AGATTTCCAA ATCTGTCTGC 2400

TCTTCATTGG CCTGAAAGGC ACGGCTAGAA AGGAAACCTG GCAAACCAAA AATGTGGTGT 2460

CCATGCAGAT GGGTAATAAA GATTTTGCTG ACCTTACGTG GTCGAATTGT GGTTTCCAGA 2520

ATGCGATTTT GCGTACCTTC TCCACAGTCA AAGAGCCAAA CTTCGTTAAT CTCATCCAAA 2580

AGTTTCAGGG CGAGACTTGA AACGTTGCGG GCTTTAGAGG GCTGACCAGC CCCCGTTCCT 2640

AAAAATTGAA TATCCATTCG ATACTTTCTA ATTAATCAAT ATATAACATG GCTGTGCGGT 2700

TTTCCGATCG GAAATAGCGT TTGCCAGAAA AAGCAGCAGC TTCTTGCAAT AAATCCTCTT 2760

GGCTGTAGCC TTTGAGACGT TTTCGACCAT CAGCCAATCT TTCCAAATCA GTCAAAGCTG 2820

TGAGACTTTC TAGGCTGATA ACTTCCTCGT CCTCGACAGG CTTCATGTAA ATCTTACCAG 2880

ACTCTTCAAA GACTAATTGA TGGGGGAAAA TTTGCGCAAT TTCAAAGAGC AAGTCATCCG 2940

AGATTTTCTC CTCATTTTCA AAGAAAATCC GACCAAGGCC GTCACTCTCA TAACAAAAAC 3000

CAAAGGATTT ACCAGACAGA TTAAGCCGAA TAAAAGGCTT ATTTTCTAGG GTGAAACTTG 3060

GCTCAGTATT GTAAAGATTC AGTTCCTGAC TGAGTTCTGC AAAATAATCC GTCGCAGCCT 3120

GAGGACTCTT TTTCTGATAG AGTTCTGCAA AGTAGGCATT AACAACACTT GGCGGAGGTG 3180

TAATAAGTGT TAACTGCTCC TGATCTGTTT TACCAGCTAG AAGCTGATGC AGATAGACCT 3240

TGTCCAGACT TGTATAACCT CCATACTTTA GAGCCAAAGT TTTAATATCA GTCATAAAAT 3300

TCTTCTAACC TCCATTTATT TTTCTCGGAA ATGTAGCCTG TAATCACTTC GCCGTCTTCC 3360

TGATAATCAC GTTCTTCCAG AATTGCAACA CTCTCTAAAT CATGAATCTT GTAGGACTTT 3420 

GAAAAAGGCA CTCGCAGGGT AAATGCTTCA AAAATTTCCT TAATCTTATC TAGCAATAAT 3480 

GCTTGCAAGT TTTCACGACT GTCCTCAGAC TTGGCAGAAA TGAGGGTATA TGGCGTTTGG 3540 

GTAGGCGTGA AATCCTCCAC CAAATCCGCT TTATTATAAA GCGTCAAGTG AGGAATATCT 3600 

TCCATGTCCA GGTCTTTCAT GATGGAGAGA ACCGTTTTTT CATGCTCCTC GTGGTAAGGA 3660 

TTGCTAGCAT CGATAACATG AACCAGAAGG TCCACATGCT TGCTTTCTTC CAAGGTTGAC 3720 

TTGAAACTGG ACACCAACTC TGTCGGCAAA TCTTGGATAA AGCCAACGGT ATCTGTCAAA 3780 

GTTACTTGGA GATTGCCTCC CAGATGAATA CTCTTGGTTG TCGCATCCAG AGTCGCAAAG 3840 

AGCTCATCTG CTTCATACTG GGTCTTACTG GTCAAGATGT TCATGATAGT TGATTTCCCA 3900 

GCATTAGTAT AACCAATCAA ACCAATCTTA AAAGTGCTAG ACTCCAAACG TTTTTCTCTG 3960 

ACAGTCGCAC GATTTTTCTC AACCACCTTG AGCTGGCGCT CGATATCCGT GATTTGATTG 4020 

CGAACGCTAC GACGGTTCAG CTCCAGCTGG CTTTCACCAG GACCACGGGA ACCAATTCCC 4080 

CCTgCCTGAC GGCTGAGCAT AATCCCCTGA CCAACCAAGC GAGGCAAAAG GTATTTGAGT 4140

TGGGCTAGGT GGACTTGGAG CTTCCCTTCA TGGCTTCGAG CCCGCATGGC AAAGATATCC 4200 

AAAATCAACT GCATACGGTC AATGACCTTA ACACCGAGAA CTTCCTCTAG ATTGACATTC 4260

TGCCTTGGGG TCAGACGATT GTTGACGATG ACAGTAGTGA TTTCTTCTGC ATCCACCATA 4320 

AGCGCAATCT CTTCCAACTT ACCAGAGCCG ACGAAGGTCT TGGAATCATA TTTTTCACGT 4380 

TTTTGTCTGT AGCTATCTAC AACGACTGCC CCTGCCGTTT TCGCTAAACT AGCCAATTCT 4440 

TCCATGGAGA GGTCAAAACT GTCCATACCC TGCAATTCCA CACCAATCAG CAGGACTCGC 4500 

TCCTCTTTTT TCTCCGTTTC AATCATCTAA AAACTCCTCT ATCTGGCTTA AAATGCGGTC 4560 

TTGTACACCA GATTCTCCAA TCTGATAAAA GGTGACCTGC ATGCGATTAC GGAACCAGGT 4620 

CAGCTGACGC TTGGCAAAAC GACGAGTCGC CTGTTTAAGA CTCTCACTAG CTTCCTCCAA 4680 

GGTCTGCTCT CCACGGAAAT AAGGAAAGAG TTCCTTATAG CCAATTCCTT TAGCAGCCTG 4740 

TACATTAGGG GAATGGTCAA ACAGCCACTT GGCCTCATCC AAAAGCCCAG CCTCAAACAT 4800 

CAAATCCACT CGGTGGTTGA TACGCTCATA AAGTTGACTA CGTTCATCAT CCAAGCAGAT 4860 

AATCAGCGGT TCATACAAGG TCTCTTGATT TTCCAAATCC TGACCAAAAT GGGCAATTTC 4920 

TAAGGCACGC ATAGCACGAC GACGATTAAA CTGGGGAATC TCAAGGCCTG CTTGATCCAC 4980

CAAATGGGCT AATTCCTCAT CTGAATATGG CTCCAAACTA GCTCGATAAG CTAAAATCTC 5040 

CTCATGAGGA GTCTCCCCAC CTAGGTGGTA ACCTTCTAGC AAGCTCTGGA TATAAAGTCC 5100 
AGTCCCACCG GCGATAATGG CTAGCTTGCC ACGGTTGTGA ATACCCTCAA TAGTCATCTT 5160

AGCTTCTGAA ACAAAATCAA AAGCCGAGTA AGACTCGGTT ATCTCTCTAA CATCGATTAA 5220 

ATGATGAGGA ACAGCTGCCT GCTCTTCTGG ACTAGCCTTG GCCGTCCCAA TATCAAGTCG 5280 

TCGATAGACT TGCTGGCTAT CTCCACTAAC CACTTCGCCA TTAAAACGCT TTGCGGGG 5338 
(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 19446 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

CGGAAACCCA TCTAGTCTCC ATCGTTTGGG AGACCAAGCA ACACGAATCT TAGATGCTTC 60 

TCGCCAACAG ATTGCAGATT TAATCGGTAA GAAAAGCGAT GAAATCTTCT TTACCTCGGG 120 

TGGAACAGAA GGGGATAACT GGCTTATCAA GGGTGTGGCC TTTGAAAAAG CTCAGTTTGG 180 

CAAGCACATC ATTGTTTCAG CCATTGAACA TCCAGCAGTC AAAGAGTCAG CCCTCTGGTT 240 

GAAAAGTCAA GGATTTGAAG TGGATTTTGC TCCAGTTGAT AAGAAAGGCT TGGTCGATGT 300 

TGAGGCGTTA CAGGTTTGAT ACGGCATGAT ACAATCCTCG TTTCCATCAT GGCTGTGAAC 360 

AATGAAATCG GCTCTATCCA ACCTATTGAG GCTATTTCAG AATTCTTGGC AGACAAGCCG 420 

ACTATTTCCT TCCACGTTGA TGCGGTTCAG GCGCTTGCCA AAATTCCGAC TGAAAAGTAT 480 

CTGACAGAAC GGGTGGATTG CGCGACTTTC TCTAGTCACA AGTTCCACGG GGTTCGAGGT 540 

GTTGGCTTTG TCTATATCAA ATCTGGCAAG AAGATTACAC CTCTTCTTAC AGGTGGTGGC 600 

CAGGAGCGAG ATTATCGTTC GACAACTGAA AATGTGGCAG GGATTGCAGC GACAGCCAAG 660 

GCCCTCCGTT TGTCTATGGA AAAGCTAGAT ATCTTTAGGA GCAAGACTGG GCAGATGAAG 720 

GCAGTGATTC GCCAAGCTCT TCTGAACTAT CCGGATATTT TTGTCTTTTC AGATGAGGAA 780 

AACTTTGCAC CTCATATTCT GACTTTTGGA ATCAAAGGTG TTCGAGGTGA AGTCATCGTT 840 

CACGCCTTTG AAGACTATGA TATTTTCATC TCAACAACCT CAGCTTGTTC ATCTAAGGCA 900 

GGAAAACCAG CCGGTACCTT GATTGCCATG GGAGTGGACA AAGATAAGGC CAAGTCAGCT 960 

GTGCGTCTTA GCCTAGACTT GGAAAATGAT ATGAGTCAGG TCGAGCAGTT TTTGACCAAG 1020 

TTAAAATTGA TTTACAATCA AACTAGAAAA GTAAGATAGG AGCATTCATG CAGTATTCAG 1080 

AAATTATGAT TCGCTACGGA GAGTTGTCAA CCAAGGGTAA AAACCGTATG CGTTTCATCA 1140 
ATAAACTTCG TAATAATATT TCGGACGTTT TGTCTATCTA TACCCAAGTT AAGGTAACAG 1200

CAGATCGCGA CCGTGCCCAC GCTTACCTCA ATGGAGCTGA TTACACAGCA GTTGCAGAAT 1260 

CTGTCAAACA AGTTTTTGGA ATTCAAAACT TTTCTCCTGT TTATAAGGTT GAAAAATCTG 1320 

TAGAAGTTTT GAAGTCTTCT GTCCAAGAGA TTATGCGGGA CATCTACAAG GAAGGTATGA 1380 

CCTTTAAGAT TTCTAGCAAG CGTAGCGACC ACAACTTTGA ACTTGATAGT CGTGAACTCA 1440 

ACCAAACACT TGGAGGGGCT GTATTCGAAG CCATTCCAAA TGTGCAAGTT CAAATGAAAA 1500 

GTCCTGACAT CAATCTTCAG GTGGAGATTC GTGAAGAAGC AGCCTATCTT TCTTATGAAA 1560 

CCATTCGTGG GGCTGGTGGT TTGCCAGTTG GAACTTCAGG TAAAGGGATG CTCATGTTGT 1620 

CAGGAGGGAT TGACTCACCT GTAGCAGGTT ATCTTGCTCT TAAGCGTGGG GTGGATATCG 1680 

AGGCAGTTCA CTTTGCTAGT CCACCATATA CTAGTCCTGG TGCCCTCAAG AAAGCGCAGG 1740 

ACTTGACCCG TAAATTGACC AAGTTTGGCG GAAATATCCA GTTTATAGAG GTGCCTTTCA 1800 

CAGAGATTCA AGAGGAAATC AAAGCCAAAG CGCCAGAAGC TTATTTGATG ACTCTAACTC 1860 

GTCGCTTTAT GATGCGGATT ACTGACCGTA TTCGTGAGGT ACGAAATGGT TTGGTTATCA 1920 

TCAATGGGGA AAGTCTAGGT CAAGTAGCCA GCCAAACCCT TGAAAGTATG AAGGCTATCA 1980 

ATGCTGTTAC CAACACTCCC ATCATTCGTC CTGTGGTTAC CATGGACAAG TTGGAAATCA 2040 

TTGACATCGC CCAGGAAATC GATACCTTTG ACATTTCAAT CCAACCGTTT GAAGACTGTT 2100 

GTACCATTTT TGCACCAGAT CGTCCAAAAA CAAATCCTAA AATTAAGAAT GCGGAGCAGT 2160 

ACGAAGCGCG TATGGATGTT GAAGGCTTGG TTGAGCGAGC AGTGGCTGGA ATCATGATTA 2220 

CTGAAATCAC ACCTCAAGCC GAAAAAGATG AAGTTGATGA CTTGATTGAC AATCTGCTCT 2280 

AATTCAGAAA ATCCAAAAGA ATAGCGAAAA TCAGTAAAAA AAGTTAGTTT TTTCTCTAAA 2340 

AACAGGTAAA AAACTAACTT TTTTTATTTT TATGATATAA TGATATAAAA TTTTGAATAT 2400 

AGAGAGTTTT CTGACAATGA ATCAATCCTA CTTTTATCTA AAAATGAAAG AACACAAACT 2460 

CAAGGTTCCT TATACAGGTA AGGAGCGCCG TGTACGTATT CTTCTTCCTA AAGATTATGA 2520 

GAAAGATACA GACCGTTCCT ATCCTGTTGT ATACTTTCAT GACGGGCAAA ATGTTTTTAA 2580 

TAGCAAAGAG TCTTTCATTG GACATTCATG GAAGATTATC CCAGCTATCA AACGAAATCC 2640 

GGATATCAGT CGCATGATTG TCGTTGCTAT TGACAATGAT GGTATGGGGG GGATGAATGA 2700 

GTATGCGGCT TGGAAGTTCC AAGAATCTCC TATCCCAGGG CAGCAGTTTG GTGGTAAGGG 2760 

TGTGGAGTAT GCTGAGTTTG TCATGGAGGT GGTCAAGCCT TTTATCGATG AGACCTATCG 2820 

TACAAAAGCA GACTGCCAGC ATACGGCTAT GATTGGTTCC TCACTAGGAG GCAATATTAG 2880 

CCAGTTTATC GGTTTGGAAT ACCAAGACCA AATTGGTTGC TTGGGCGTTT TTTCATCTGC 2940

AAACTGGCTC CACCAAGAAG CCTTTAACCG CTATTTCGAG TGCCAGAAAC TATCGCCTGA 3000

CCAGCGCATC TTCATCTATG TAGGAACAGA AGAAGCAGAT GATACAGACA AGACCTTGAT 3060

GGATGGCAAT ATCAAACAAG CCTATATCGA CTCGTCGCTT TGCTATTACC ATGATTTGAT 3120

AGCAGGGGGA GTACATCTGG ATAATCTTGT GCTAAAAGTT CAGTCTGGTG CCATCCATAG 3180

TGAAATCCCT TGGTCAGAAA ATCTACCAGA TTGTCTGAGA TTTTTTGCAG AAAAATGGTA 3240

AGTTAAGAAA GGAAAAAACG AAATGCATAT TGAACATCTT AGCCACTGGA GTGGTCATCT 3300

TAACCGTGAA ATGTACCTTA ACCGTTATGG ACATGGTGGG ATTCCAGTTG TGGTCTTTGC 3360

TTCATCAGGT GGTAGTCACA ACGAATACTA TGATTTTGGC ATGATTGATG CCTGTGCTTC 3420

CTTTATCGAG GAAGGCCTTG TCCAGTTCTT TACCCTATCT AGTTTGGATA GTGAGAGCTG 3480

GTTGGCTACT TGGAAAAATG CTCATGACCA AGCGGAAATG CACCGTGCCT ACGAACGTTA 3540

TGTGATTGAG GAGGCCATTC TTTTATCAAG CACAAGACAG GTTGGTTTGA TGGCATGATG 3600

ACGACAGGTT GCTCTATGGG AGCCTATCAT GCACTCAATT TCTTCCTCCA GCATCCAGAT 3660

GTCTTTACCA AAGTGATTGC TCTCAGTGGT GTTTACGACG CACGTTTCTT TGTCGGTGAT 3720

TACTACAACG ATGATGCTAT TTACCAAAAC TCGCCAGTAG ATTATATTTG GAACCAAAAC 3780

GACGGCTGGT TTATTGACCG TTACCGTCAG GCAGAGATTG TGCTGTGTAC GGGGCTTGGA 3840

GCCTGGGAAC AAGATGGTTT GCCATCCTTT TACAAGCTCA AAGAAGCCTT TGACAAGAAA 3900

CAAATTCCAG CCTGGTTTGC TGAATGGGGA CATGATGTCG CCCATGACTG GGAATGGTGG 3960

CGTAAACAAA TGCCTTATTT CCTCGGTAAT CTCTATTTAT AAAAGGAGTT ACCTATGAAT 4020

TACCTTGTTA TTTCTCCCTA CTATCCACAA AACTTTCAAC AGTTTACCAT CGAACTAGCT 4080

AATAAAGGCA TCACAGTCTT GGGAATTGGT CAAGAGTCTT ACGAGCAATT GGATGAGCCC 4140

TTGCGCAATA GCTTGACCGA GTATTTTCGT GTTGATAATC TTGAGAACAT AGATGAAGTC 4200

AAACGTGCAG TTGCTTTTCT CTTTTATAAA CATGGTCCAA TTGGCCGCAT CGAGTCTCAC 4260

AATGAATACT GGCTTGAGCT AGACGCAACA CTCAGAGAAC AATTCAATGT TTTTGGTGCC 4320

AAACCAGAGG ATCTCAAAAA GACGAAATAT AAGTCTGAAA TGAAGAAACT TTTCAAAAAA 4380

GCAGGTGTTC CTGTGGTACC TGGAGCTGTT ATCAAGACGG AAGCAGATGT TGATCAAGCA 4440

GTGAAAGAAA TCX5GTCTTCC AATGATTGCC AAACCTGATA ATGGAGTGGG AGCAGCCGCA 4500

ACCTTTAAAC TTGAGACAGA AGACGATATC AATCACTTCA AGCAAGAATG GGACCATTCA 4560

ACCCTTTATT TCTTTGAAAA ATTTGTCACT TCCAGCGAAA TCTGTACCTT TGACGGGCTC 4620

GTGGACAAGG ATGGAAAGAT TGTCTTCTCA ACAACCTTTG ACTACGCCTA TACACCGCTT 4680

GACCTCATGA TTTATAAGAT GGACAATTCT TATTATGTGC TCAAGGATAT GGATCCTAAA 4740

CTGCGCAAGT ATGGGGAAGC AATTGTCAAA GAATTTGGTA TGAAAGAACG GTTTTTCCAT 4800

ATTGAGTTCT TCCGTGAGGG GGACGATTAT ATTACCATCG AGTACAATAA CCGCCCTGCA 4860

GGTGGTTTTA CCATTGATGT TTATAACTTT GCTCATTCCT TGGACCTTTA TCGTGGCTAT 4920

GCAGCTATTG TCGCAGGAGA GGAGTTCCCG GCGTCAGACT TTGAAACTCA GTATTGTTTG 4980

GCTACTTCTC GCCGTGCAAA TGCTCACTAT GTTTATTCAG AAGAGGATTT GCTTGCCAAA 5040

TATAGCCAGC AGTTCAAGGT TAAAAAAGTC ATGCCAGCTG CCTTCGCGGA ACTTCAAGGA 5100

GATTACCTGT ATATGCTGAC CACTCCGAGT CGACAAGAAA TGGAGCAGAT GATTGCAGAT 5160

TTCGGACAAC GTCAAGAATA AGAACTATCG GATTAAGGAA ATTAAPTrrv* tt a a <wr"wrn 5220

TGTTTTGTCT GATAAAAAAT AAGAGCATCC CAACAAGGTA nCTATCATaa iazr«v*rr"wr*^ 5280

ATAACTATTT GAAGCAGGAT TAGGTGGTCA GAAATTAAAT 1 TTTAATATt^P fAAiwnrmn 5340

ATAGTATTGT GTTTGCGTAT CCTTAAATCA GCTAAAAGGA TCCA'PttAffia rarrriTarr 5400

ATATAGTTTT CAAGATACCA AACAAGTCTA TTAATATTCA ATGAAAATCA AAGAGCAAAC 5460

TAGGAAGCTA GCCGCAGGTT TCTCAAAACA CTGTTTTGAG GTTGTfiGATA naarrnarar 5520

AGTCAGTATC ATATACTACG GCAAGGTGAA GCTGACGTGG TTTGAAGAGA TTTTCGAAGA 5580

GTATAAAATA TTCAGGTGAC GCATAGATAT AGTTAATTGA AGCTTTGTTT GAAATCTGAT 5640

rwxnm. r\n x TATTACTAAG TTTTAAAAAC TAAAGAAAAG GGAAGATATG ATTACAGGCG 5700

AATTAAAAAA TAAAATCGAT CAGCTGTGGG AAATTCTTTG GACAGAAGGA AACGCAAATC 5760

CTTTAACAAA TATTGAACAG TTGACTTATC TCTTATTTAT GAAAGATTTG GATAGTGTCG 5820

AGCTTGGACG TGAAAGTGAT GCTGAATTTC TAGGGATTCC TTATGAGGGA GTTTTTCCAA 5880

AAGATAAACC TGAATACCGT TGGTCAACTT TTAAAAATAT AGGAGATGCT CAGGAAGTTT 5940

ATCGTTTAAT GACTCAGGAG ATTTTTCCGT TTATTAAAAA TCTCAAGGGG GATACAGATG 6000

ATACAGCCTT TTCACGATAT ATGCGAGAAG CTATTTTTCA AATAAATAAA CCTGCTACGC 6060

TTCAAAAGGC AATTTCTATC TTAGATGTTT TTCCAACTAG GGGATTAGAT GTAGATTTTG 6120

ATAATGACAA ACAAAGTATT ACTGATATCG GAGATATCTA TGAATATCTG TTATCAAAAT 6180

TGTCGACCGC AGGTAAAAAT GGACAGTTCC GTACACCTCG TCACATCATC GATATGATGG 6240

TTGAGTTGAT GCAACCGACT ATCAAAGATA TCATCTCAGA TCCCGCTATG GGTTCTGCTG 6300

GCTTCTTAGT ATCTGCTAGC CGTTACTTAA AGCGTAAGAA AGATGAATGG GAAACCAATA 6360

CAGATAATAT CAATCATTTT CATAATCAGA TGTTTCATGG AAATGATACG GATACGACTA 6420

TGTTGAGACT TGGGGCGATG AACATGATGC TACATGGAGT AGAAAATCCA CAAATCAGTT 6480

ACCTTGACTC GCTGTCTCAA GATAATGAAG AAGCCGATAA ATATACTTTG GTTTTAGCAA 6540

ATCCTCCTTT TAAGGGCTCA CTTGACTACA ATTCAACCTC TAATGACCTT CTTGCAACCG 6600

TAAAAACCAA AAAAACAGAA TTACTCTTTC TTTCTCTTTT CTTGCGAACT TTAAAACCAG 6660

GTGGACGAGC AGCAGTTATC GTACCTGATG GTGTCCTTTT TGGTTCGTCT AAAGCTCATA 6720

AAGGAATTCG TCAGGAAATT GTAGAGAATC ATAAGCTTGA TGCTGTAATC TCAATGCCTA 6780

GTGGTGTGTT CAAGCCTTAT GCTGGAGTTT CAACTGCCAT TCTCATCTTT ACAAAAACTG 6840

GTAATGGTGG TACTGACAAA GTCTGGTTTT ACGATATGAA AGCGGATGGT TTAAGTTTGG 6900

ATGATAAGCG ACAACCGATT AGCGACAATG ATATTCCAGA TATTATCGAA CGCTTTCATC 6960

ATCTTGAAAA AGAAGCAGAA CGTCAGAGAA CGGATCAATC TTTCTTTGTT CCAGTTGCTG 7020

AGATAAAGGA AAATGATTAT GATTTGTCTA TCAATAAATA TAAAGAGATT GAGTATGAAA 7080

AAGTTGAGTA TGAACCAACA GAAGTCATAT TAAAGAAAAT CAATGATTTA GAAAAAGAAA 7140

TTCAAGCTGG CTTGGCTGAA TTGGAAAAAT TACTCAAGTA GGGAGGTGGC TGTATGAAAA 7200

AAGTGAAGTT GGGGGAAGTC TTATCTCTAA AAAAAGGCAA GAAAGCCACT GTACTTGCTG 7260

AACAAACAAC TCTAAGCCAA CGTTATATTC AAATAGATGA TTTAAGAAAT AATAATAATT 7320

TAAAATTCAC TGAAAGTTTA AATATGACTG AAGCACTCCC AGATGATATT CTGATAGCAT 7380

GGGATGGAGC TAATGCAGGA ACAGTTGGTT ATGGATTATC GGGAGCTGTT GGTAGTACAA 7440

TTACGGTCTT AAAAAAGAAT GAGCGATACA AAGAAAAAAT TATATCAGAT TACTTGGGAG 7500

TCTTTTTGGA AAGTAAATCG CAGTATTTAC GAGATCATTC AACAGGTGCA ACAATTCCTC 7560

ATTTAAACAA GAATATATTA CTTGATTTAC AATTAGAATT GCTAGGTATC GAAGAACAAG 7620

AGAACATTAT CTGTATTCTT AATACGATTA AAAGGCTTAT TACTAAAAGA AAATTTCAGT 7680

TAGATGAACT AAACTTGCTC GTCAAATCCC GATTTAACGA GATGTTTGGG GAAAATAAAA 7740

TATTTGAAAG CATTGATAAC TTATTTGATA TTATAGATGG TGATAGGGGC AAAAATTATC 7800

CTAAATCAGA TGAGTTGTTT AGTGAGGAGT ACTGTTTATT TTTAAATACA AAGAATGTTA 7860

CTAAAAACGG ATTTTCATTC GATACAAAGC AATTTATCAC TAAAACAAAG GATAAATTAC 7920

TTCGAAAAGG CAAACTTGAG CGTTATGATA TAGTCTTGAC AACAAGAGGT ACTGTTGGAA 7980

ATGTAGCGTA CTACGATGAA TTAATAAAAT ATAAACATTT ACGTATAAAT TCAGGTATGG 8040

TAATATTACG TCCCAAGACA CCAAATCTAA ATCAGAAATT TATTATCCAT GTTTTAAGGA 8100

ATAATAATTA TAGTCGAGTG ATATCAGGAA GTGCTCAGCC TCAGTTACCA ATTACAAAAT 8160

TAAAAAAAAT ACTTCTCCCC CTCCCCCCAC TAGCCCTCCA AAATGAGTTG GCAGAGTTTG 8220

TAGTCCAGGT CGACAAATCA CAATTTGCTT GTGAGATAGC TATAAAAGTG TGGAGAAATA 8280

GCTTGAAATT TAGTATAATA TAGCTAAACT ATTTGTTTAA AGTGAGAAAA AAATGGGAAA 8340 

TTTTAGCTTT CTTTTAAAAA ATGACGAATA TGAATCTTTT TCAAAACCTT GCATTGAAGC 8400 

TGAGAATATG ATTGCTACAT CAACTGTGGC TACTGCCTTT ATGGCGCGTC GTGCTTTAGA 8460 

GCAGGCTGTC CATTGGATAT ATAGTCACGA TTCATATTTA GAAGCTCCCT ATCGTGCTAC 8520 

TCTATCTTCT TTAGTATGGG ATGATGATTT TAGGGATATC GTAGATTCTG AACTCCACAA 8580 

GCAGATAGTT CTGTTGATTC GGTGGGGAAA CCATGCTGCT CATGGTGGTG AAATTAAGGA 8640 

ACGAGAAGCG ATTTTAGCTT TGCATCATTT GTATCAGTTT GTTAATTTTA TCGATTATTG 8700 

TTACAGCAAT GAGTTTGTGG AGCGTTATTT TGATGAGAAG TGCTTACCAC TTTCAGCAAA 8760 

CATCAAATAC CGAGAAACTC CACAATCTAT GATAAAGTTA CAAGACAGTT TACCAGAACT 8820 

GCCTGATTTT CATGAACAGA TGGCTGCTCA GTCCGTAGAA GTTCAAGAGA CTTATACTGA 8880 

AAAACGTGAG ACTGCAGCGC AACGGCAAGA TGTGCCTTTC CATATTGATC AATTATCTGA 8940 

GGCAGAGACA AGAAAGCTCT TTATTGATAT CGATCTCCGT TTAGCAGGAT GGATATTTGA 9000 

AGAAAACTGT CGTGTTGAGA TAGCCGTTGA TGGTCTCAAG CACGGTTCAG GAATTGGTTA 9060

CTGTGACTAT GTACTTTATG GTAAAAATGG GAAAATTTTA GCGATTGTGG AGGCTAAAAA 9120 

AGCCTCTGTC AATCCAGAAG TAGGGGAAGT ACAGGTCAAA GAATATGCTG AAGCTTTGGA 9180 

GAAACATATC GGCTATCAGC CAATTTGCTT TATTACAAAT GGGTTGAAGC ACTATATACT 9240 

TGATGGTCCG AACCGCCGCC AGATTGCAGG CTTTTACTCT CAAGAAGAAT TGCAATTAGT 9300 

GATGGATAGA CGTCATCTTC AAAAACCGCT TGAGGATATT TCTAGTAAAA TTAGGGACGA 9360 

TATTTCCGGG CGTCACTACC AAAAACATGC CATTGCAAGC GTTTGTGAAG CTTTCTCTGA 9420 

TCATCGTAGA CAGGCACTTT TGGTTATGGC AACTGGGGCG GGGAAAACTC GTACAGCAGT 9480 

TTCTCTAGTT GATATCTTAT CACGTCATAA CTGGGTAAAA AACGTTCTCT TCTTAGCCGA 9540 

TAGAACTTCC TTGGTTAAGC AAGCATATGA TTCGTTTAGA AAATTACTCC CAGATCTTTC 9600 

CGTTTGTAAC TTCTTAGAAG ATAAAGAAGG AGCTCAATCA AGTCGCATGG TCTTTTCAAC 9660 

TTATCCGACC ATGATTGGAG CGATTAGTGG TCAAGAAGAA GTAAATCAAC GCCCTTTCAC 9720 

TGTTGGGCAT TTTGACCTTA TCATAATTGA CGAATCTCAC CGTTCTATTT ATCAGAAATA 9780 

CAAGTCCATT TTTGATTATT TTGATGCAAG AATTGTAGGC TTAACAGCTA CTCCGCGTCA 9840 

AGATTTAGAT AAAAACACCT ATGGATTCTT TAATTTGGAG AATGGGGTTC CAACATATGC 9900 

ATATGATTTG GAAGAGGCTG TTAAAGACGG ATATTTAGTA GCCTATCATT CTATCGAAAC 9960 

CAAACTGAAA CTACCTACGG ATGGTCTACA TTATGATGAT TTGTCCGAAG AAGAAAAGGA 10020 
ACATTTTGAT AGCAAATTTG AAGACAATAG CTGTGAAAAA GATATTGATG GGAGTGTATT 10080

TAATTCCTTT ATTTTCAATA AAAGTACAGT AGAAATTGTT TTAAATGAAC TCATGACAAG 10140 

AGGAATTCAG ACAGCCTCGG GTGATGAAAT TGGTAAAACT ATTATTTTTG CTAAAAATCA 10200 

TGATCATGCG GAATATATCA GAGGTATTTT TAACAACCGC TATCCTGAAA AAGGGAGCGA 10260 

CTATGCTCAG GTGATTGATT ATAGTATTAA GCATTATCAG ACCTTGATTG ATGATTTTAA 10320 

AATTAAGGAG AAGTATCCTC AAATTGCGAT TTCTGTCGAT ATGTTAGATA CAGGTATTGA 10380 

TGTACCAGAG GTTGTTAATT TAGTCTTCTT CAAGAAAGTA CGCTCTAAAA CTAAGTTTTG 10440 

GCAGATGATT GGTCGAGGAA CCCGTCTATG TAAAGATTTA TTTGGACCTG AGCAGGATAA 10500 

GGAAAACTTC TTGGTATTTG ATTATGGGGA CAATTTTGAT TATTTTCGTG CAGATCCAAG 10560 

AGATGGAGAG GGTCGTCACA TTGTTTCGCT GACTCAGCGT TTATTTAATA TCAAAGTGGA 10620 

CTTGATTCGA GAACTTCAGG GACTCCAATA CCAAGAAGAT CAGTTTGCGA GAGCATACCG 10680 

TCAGCAGCTT GTCTCGGAAC TTCAAGGTCG TATAGAGAGC TTAAATGAGT TGGACTTCAG 10740 

GGTTCGTATG GTTTTAGATA CAGTTTATAG CTATAGGAAA TTGGAAAGTT GGCAGAATCT 10800 

AACTGCTGTT ACAAGTGAAA CCATTCAAAA AAATCTCTCT CCGCTTTTAT TTGATGAAGA 10860 

TAAAGAAGAT GAGATGGCGA GGAGATTTGA TTTGTGGTTG CTTCATATTC AGTTGGGGCA 10920 

ACTGACAGCT AAATGTTCCA CTGTTCATAT TTCCCAAGTG ATGAAGACGG CTAGAGCTCT 10980 

TTCTGCTATT GGCAATATCC CGCAGGTTTT TGAGCAGGCT GAAATTATCA GGAAAGTACA 11040 

GGAGCCTGAA TTTTGGAAAG AAGTTAACTT GTCTGATTTG GAAAAAATTC GTCTTGCTAT 11100 

TCGAGATTTA TTACAGTTTT TGGATAAAAC AGACCGTAAA CCCTACTATG TTAACTTTGA 11160 

AGATCGTATA CTCTCCACTG TTCACGAGAC CACAGCATTT TTGCAGGTCA ACGATCTTCG 11220 

GTCTTACAAT GAAAAAGTTG AGCATTATTT GAAAACTCAT CTGGATGAGG AGTCCATTTC 11280 

TAAGCTATAC CATAATAAAA AGTTGACATC TGATGATATG CTTGCACTTG AAAAATTGCT 11340 

TTGGGAAAAA TTAGGTAGTA AAGCAGACTA CCAAAGTCAT TATGAAAATA AGGCAATTCG 11400 

GAGATTGGTT CGTGAGATTA TTGGCTTAGA TAGAGAGTCT GCCAATCGTA TTTTTTCTAA 11460 

ATTTTTGTCG GATGAGAATC TTAATGCCAG GCAGATTTCA TTTGTAAAAT TGATTGTAGA 11520 

CTACATTGTA GAAAATGGTT TTTTAGAGAC GAAAGTGTTA ACGCAAGAGC CGTTTAAATC 11580 

TTATGGTTCT GTTCAACTAC TCTTCCAACA CCAACTACCA GTACTTCGTA ATATTGTTCA 11640 

AATCATTGAA CTTATCAATA ATCGAGCTGG AGAAGCGGCT TAAATTCTAA AGTGATTGCC 11700 

ATGCTGAGAC TCATTTAAAA TTAAAAAGAG TAGAAATTTA TGCTATATAT GAGAAGTTTT 11760 
ATTAGGAAGA ATGTCATCGT TTTCCTAGAA TACAGTATCA GTTGTTAAGT GGTTGATAAA 11820

TTTCAAAGTA GATACTTGTA CCACGATGTT TGTTGATCGA GTTATTAACA AAAGAGCTAC 11880 

TTTGATTTTA AAGAAATAGA AAACAAAAAG CCGAGCAAGA ATTCAATTGC AGGAGAAAAT 11940 

GAAATAATAC TCAATGAAAA TCAAAGAGCA AACTAGGAAA CTAGCTGCAG GCTGCTCAAA 12000 

ACACTGTTTT GAGGTTGCAG ATGGAAGCTG ACGCGGATTG AAGAGATTTT CGAAGAGTAT 12060 

AAATCTTCCT AGGATAAAGC AAAACGCATA GTATCAAGGG TTTTCAACAC TTGATACTAT 12120 

GCGTTTTCTG ATGTTAAAGA CTTTCTACCA GGTTTTTTAA AAGCATAATT GTTAGTTGTA 12180 

GTCATTTATT ATTCTTCAAA GAAAAATGGT GGGGCGAATT TTTTCAGTTC TTCAAAGCAC 12240 

TTTTGAGCAG TATCTGCATC TTCACAGATG ATAAGACAGA CATCATTACC ACAAAGGGTA 12300 

GCGATAGCGT CAGGGAAGCT CAAAGTATCA ATGATAGAAC CAAAGGATTG AGCCAGTCCA 12360 

GGAAGGGTTT TTAGTAGGAC TTGGTGTTGA ACTGGGCGCA TCCAGACAAG GGCGTCTTCC 12420 

ATGTAGAGTT CGAGACGTTT TTCCCATTTT GAGATGGAAC CATTGTTAAG AACATAATAA 12480 

GCGCTATCTT CTTCGCGGAC TTTTGATAGG TTCATATTTT TGATGTCGCG TGAGAGGGTT 12540 

GCCTGGGTTA CTTGAATGTC GTTCTCAGCA AGAAGGGCTT GCAACTCAGC CTGTGTATGA 12600 

ATCTTGTTTT TTGTGATAAG AGCGCGTATA AGTTGGTGGC GGTGTTCTGA TTTATTCATA 12660 

ATAATGTAAC TCCTTTTAGC AAGGTAAGGT AAGCATGGAC TGAGCGAGGT CGACAGTCAA 12720 

GTGGTAGTCT GTATTGTCAC GGATGGTGAT TTCAAAGTCA GTAGTATAGA GGACTAAACG 12780 

GAGAGTGTCT CCTTCTTTTA GCTTGTAAAT AGTTGGCTGC AGTTCAAATT GAACGTCCAT 12840 

CCATTCATCT GCAGTAATAT CCTCTACTAA CAGTAAATCA TTTCTATTTT GTAAATTAAG 12900 

GTAACCTTTT GTCACGACTC GTTGTGCCTC TGGTCTAAAT GGCAATTCAC AGAGATTTTC 12960

CAACATGTGA TAGCGACCGT TGTCAATGGT TCTAGCACTT AAAATAGCTG GATAAGGTTG 13020 

TAGGTATTTC TTTTGCCCAA ATTCTAGCAG TTGGGCAGAT AAGAGCCCCT TGTTTGTACT 13080 

GGATTTGATA CGAAGATTGA GCTGAGCGCG ACCGTTTAGG TGGAGATCTT TAGTCACAGG 13140 

AAGGTTAATA GTAATCTGAT TGGCTTTCCC TTGATAGAGC TCTGTATTGA AGGTTTGGTA 13200 

TGTCTTACCA TAGCGCTCAA AATCCTTATC TGGGTACTGG TTTTGAATAG CTTGCTCTTC 13260 

TTGACCAAGT GAGAAGGTTT CACAGTTTTC TTGCCCACCG AAGTTATCAA GTGATAACCA 13320 

AGTCTGTGGA GCTGTATTGT CCTGCCAGAT AACAGTAGGA AGTTGAAAGT CTGTTTCCTG 13380 

TCCTAGTAAT TTCTTGGTCA ATAAGGCATT TATGGACTCA CGGAAGTCAA TTGATTGCCA 13440 

ATTGTTCATG TAAACATGGG CACCATTATG GAAAAAGAGA TGCTTGTGTA TATGAGTAGG 13500 

AAGAGCATGG AACATCTGGT AAACATGAAG TGGTTTGACA TTCCAATCCT GAGAACCATG 13560 
AGTAAAGACA ACCTCTGCCT TTACTTTATG GGCATTGAGC AGATAATTGC GGTCATGCCA 13620

AAACTGATTG TAGTCCCCAG TTTTTCGGTC TAGCTGAGCT TTCACTTTTT CTAAGTCAGC 13680 

TTGGTGAGCT TCATTGCCAC GGATATAGTC GCCAGCTAAG AGATTACGAG AATAGGTTAA 13740 

CTCAGCAAGG GAGTCAAAGT CCTCACCTGG ATAACCACCT GGGCTAGTCA CCAGACCGTT 13800 

TTCACGGTAG TAGTTGTACC ATGATGAAAT TCCTGCCTCG GCAATGATAA CTTCTAAACC 13860 

ATCGACTCCT GTAGTCGCAA GACCATTGGA CATGGTACCT AGATAGGAAA GTCCTGTTGT 13920 

AGCAACTTTT CCGTTTGACC AATCAGCCTT GACTTGACGC TGGCGCGTGT GATCAGTAAA 13980 

GGCACGGCAA CGACCGTTAA GCCAATCGAT GACATTTTTA TAAGCCTCGA TTTGCTGGTA 14040 

GTCTCCATTA GTCATGAAAC CTGTCGAGTC TTTGGTACCA ACACCTGAGA CATAGAGATT 14100 

GGCAAAGCCT CTCGGAAGGA AGTAGTCGTT TAGTGTATAG CTAGAGTTGA TGTGAGTTAG 14160 

CTTTTCCTCA GCCTCTGCTA TAAGCTCAGC TTTACCTTGG GGTTGGACGA GATTTAGTTG 14220 

AGGTTTCTCT AGCTCAATCT TGTGAGGAAG CTTAACCTCA AGCTCGCCCT CCATCTTGTA 14280 

GAGAGCCTTG TCACTAGCCT TGTCATTGGT TCCCTGATGA TAAGGGCTGG CTGTCATGAT 14340 

GGCAGGGATT TTTCCATCAA AACGAGGGCG AATAATGCTA ACCTTTACTA GGTCTGATAG 14400 

CCCTTTTTGG TCAGTATCGA CACGAGACTC AACGTAAACG ACTTCACGAA TGACATCCTG 14460 

GTTAGAAAAA GTAGCCAAAC TCTTGCCGTT AAAGTAGTGG TAGTCATTAT CCTCCGGAAT 14520 

AAGACCATCA CTAACAAGTT GGTCGATAAG AGTATTTCCT TTTTTGGTGC GAGTATTGAG 14580 

TAACTGATAG AGATTTTCAA TCAAGTCACC ATATATAATG GGAAATCCAG TTTCTTTACG 14640 

AAAAACGTCA CTATCTTCGA AGTCAACCAA ATAAGAAAAG CCTAAAAGTT GAAAAGCAAC 14700 

AGTATAAAAA ATATCTGCTG TCAGTTCATC TTCTGATTGA AAAAATGTCA GCAGGTCTGT 14760 

TTTTTTATCA GCTGCTAGGA TAGAAAGTGG GTAGTTGGTG TCTTGATAAG TGAAAAAGAA 14820 

ACGACGTAAA AAGGTTTCAA GTGAGTCTTT GTGATTGGCT GTATTTTGTA AATCAAAGCC 14880 

ACATTTTTTT AGTTCAGATA AGACATTTTC TTTTGGAAAA TTGATATAAC TATATTGATT 14940 

AAAACGCATA GAACCTCCAT ATAGAATGAC AGTTAAGGTT ATTATATCAA AAAAAAAGCA 15000 

GAAAGGGAAT TGTTAACTTC AAAAGGAAAT AATCCAATAA AAATGAATAA AGTACTAAAT 15060 

TCAATATAGA GAACAGAGTA ACAATAAGAA TAAATAGATA GGGTATAAAA GTTCTAGGAG 15120 

ATTTATATTA TATGCTTTCT ATTTTTATAT ACAATATAGT ATAAATATAA AAATGATGAC 15180 

AAAAATACAA ATGAATAGAA AATAAATTAG TAAGCTGATG AAATTTTTCT CAAGAGAAGC 15240 

CATTTATAGG TGAAAATGGT ATAATATAGT GAGAAGGATA GAGGAGAAGT GTAAATTGAT 15300 
CGCACAACTA GATACAAAAA CAGTCTATAG TTTTATGGAA AGCGTCATTT CGATCGAAAA 15360

GTATGTGAGA GCAGCTAAAG AATACGGCTA CACTCATTTG GCTATGATGG ATATTGACAA 15420

TCTTTATGGC GCTTTCGACT TTCTAGAGAT TACAAAAAAA TACGGCATTC ATCCTTTGCT 15480

AGGGCTTGAA ATGACAGTGT TTGTAGATGA TCAGGGAGTG AATTTGCGCT TTTTAGCTCT 15540

ATCTAGTGTG GGCTATCAGC AGTTGATGAA GCTTTCGACA GCCAAGATGC AGGGGGAGAA 15600

AACTTGGTCA GTCCTGTCCC AGTACCTGGA GGATATCGCG GTCATTGTGC CTTATTTTGA 15660

TAGAGTTGAG TCGTTAGAAC TAGGCTGTGA TTACTATATA GGGGTTTATC CAGAAACACT 15720

AGCAAGCGAA TTTCATCATC CTATCTTACC TCTTTATCGG GTCAACGCTT TTGAAAGCAG 15780

GGATAGAGAA GTTCTTCAAG TTTTAACAGC GATTAAAGAA AATCTACCGC TCAGAGAAGT 15840

TCCCTTGCGT TCGAGACAAG ATGTCTTTAT ATCAGCAAGT TCTTTAGAGA AACTATTCCA 15900

AGAGCGTTTT CCGCAAGCTT TGGACAATTT AGAAAAGCTT ATTTCAGGCA TTTCTTACGA 15960

CTTGGATACT AGTCTGAAAC TGCCTCGTTT TAATCCAGCT AGACCAGCAG TAGAGGAGTT 16020

GAGAGAGCGT GCTGAACTGG GGCTTGTTCA GAAGGGGTTG ACTAGTAAAG AATATCAAGA 16080

TAGACTAGAC CAAGAATTGT CTGTTATTCA TGATATGGGC TTTGATGATT ATTTCTTGGT 16140

TGTTTGGGAT TTGTTGCGTT TTGGACAATC GAATGGCTAT TATATGGGAA TGGGAAGGGG 16200

1 1 1- i bLAvjTA GGCAGTTTGG TTTCTTATGC CTTAGACATC ACGGGGATTG ACCCAGTAGA 16260

GAAAAATCTG ATTTTTGAAC GCTTTCTTAA TCGTGAACGC TATACCATGC CTGATATTGA 16320

TATTGATATC CCAGATATTT ATCGTCCAGA TTTTATCAGA TATGTTGGTA ATAAATATGG 16380

TAGTAAACAT GCGGCACAAA TCGTTACTTT TTCAACCTTT GGAGCCAAGC AAGCTCTTCG 16440

AGATGTCTTG AAACGCTTTG GTGTGCCAGA GTATGAATTA TCTGCAATTA CTAAGAAAAT 16500

CAGTTTTCGT GACAATCTTA AGTCGGCCTA TGAGGGAAAT CTCCAGTTTC GTCAGCAAAT 16560

CAATAGTAAG TTAGAATACC AAAAAGCTTT TGAGATTGCT TGCAAGATAG AGGGCTATGC 16620

AAGGCAAACC TCTGTCCATG CGGCTGGTGT TGTAATTAGT GACCAAGATT TAACCAACTA 16680

CATTCCTCTA AAGTATGGTG ATGAAATTCC ACTGACTCAG TATGATGCTC ATGGAGTTGA 16740

GGCTAGCGGA CTTTTGAAGA TGGACTTTCT GGGACTACGA AATTTGACCT TTGTCCAGAA 16800

GATGCAAGAG TTGCTTGCTG AAACAGAAGG TATTCATCTG AAAATTGAAG AAATCGATTT 16860

AGAAGACAAA GAAACGTTAG CTTTATTTGC CTCTGGTAAT ACAAAAGGTA TCTTTCAATT 16920

TGAGCAACCA GGTGCCATTC GTCTGCTTAA GCGTGTGCAA CCAGTCTGTT TTGAAGATGT 16980

CGTCGCGACT ACTTCTCTAA ATCGACCGGG TGCTAGTGAC TATATCAATA ATTTTGTGGC 17040

AAGAAAGCAT GGGCAGGAAG AAGTGACTGT TCTGGATCCA GTACTGGAGG ATATTTTGGC 17100

TCCAACCTAC GGCATAATGC TCTATCAGGA GCAGGTTATG CAGGTTGCCC AGCGACTTGC 17160

CGGATTTAGT CTTGGGAAAG CCGATATTTT GCGTCGGGCT ATGGGGAAAA AGGATGCCTC 17220

TGCCATGCAT GAGATGAGGG CTTCCTTTAT TCAAGGTTCA TTAGAAGCTG GTCATACTGT 17280

GGAAAAAGCA GAGCAGGTCT TTGATGTTAT GGAGAAGTTT GCAGGTTATG GTTTTAACAG 17340

GTCACACGCC TATGCCTACT CAGCCTTGGC CTTCCAGTTG GCTTATTTCA AAACGCATTA 17400

TCCAGCCATT TTTTATCAGG TCATGTTAAA TTCTTCCAAC AGTGATTACT TAATAGATGC 17460

ACTTGAAGCA GGTTTTGAAG TAGCCTCTCT ATCCATCAAC ACCATTCCCT ATCACGATAA 17520

AATTGCCAAC AAGGCCATCT ATCTAGGTTT GAAATCCATT AAAGGAGTCA GTAATGATTT 17580

AGCTCTCTGG ATTATTGAAA ATAGACCTTA TTCTAACATT GAAGATTTTA TAGCTAAATT 17640

ACCTGAGAAT TATCTGAAAC TTCCTCTGCT AGAACCTTTG GTAAAAGTTG GTCTTTTCGA 17700

TTCATTTGAA AAAAATCGTC AAAAAGTATT TAATAACTTA GCTAATCTAT TTGAATTTGT 17760

GAAAGAGTTG GGAAGTTTGT TTGGAGATGC TATTTATAGT TGGCAGGAAT CGGAAGATTG 17820

GACGGAACAA GAAAAATTTT ATATGGAACA AGAGCTTTTA GGGATAGGTG TCAGCAAACA 17880

TCCACTACAA GCTATTGCAA GTAAGGCTAT TTACCCGATT ACCCCAATCG GAAATTTGTC 17940

AGAAAATAGC TATGCTATTA TCTTGGTTGA AGTTCAGAAA ATAAAAGTGA TTCGTACCAA 18000

AAAGGGTGAA AATATGGCCT TCTTACAGGC AGATGATAGT AAGAAAAAAT TGGATGTCAC 18060

TCTCTTTTCA GACTTATATC GTCAGGTTGG ACAGGAAATA AAAGAGGGAG CCTTCTACTA 18120

TGTAAAAGGA AAAATACAAT CACGTGATGG CCGTCTGCAA ATGATTGCAC AAGAAATAAG 18180

AGAAGCAGTT GCTGAACGCT TTTGGATACA GGTGAAAAAT CATGAATCGG ATCAAGAAAT 18240

TTCACGCATT TTAGAACAAT TTAAAGGCCC AATCCCAGTC ATCATCCGGT ATGAAGAGGA 18300

ACAGAAAACC ATCGTTTCTC CCCATCATTT TGTAGCTAAA TCCAATGAAT TAGAGGAGAA 18360

ATTGAATGAA ATCGTTATGA AAACGATTTA TCGCTAAAAA TACGGAAAAT AGAAGAATTT 18420

TCAACGTAAA TGTGGTATAA TCAGTAAGAA TGTTAAAAGA AAAAGGAGCA TAACCAATAT 18480

GAAACGTATT GCTGTTTTGA CTAGTGGTGG AGACGCCCCT GGTATGAACG CTGCCATCCG 18540

TGCAGTTGTT CGTCAAGCAA TTTCAGAAGG AATGGAAGTT TTTGGTATCT ATGACGGATA 18600

TGCTGGTATG GTTGCCGGTG AAATTCATCC CCTAGATGCA GCTTCAGTAG GGGACATCAT 18660

TTCTCGTGGT GGTACTTTCC TTCACTCAGC TCGTTACCCA GAGTTCGCTC AACTTGAAGG 18720

GCAACTTAAA GGGATTGAGC AATTGAAAAA ACACGGAATT GAAGGTGTAG TTGTTATCGG 18780

TGGTGACGGA TCTTACCACG GCGCTATGCG TTTGACTGAA CATGGCTTCC CAGCTATTGG 18840

TCTTCCAGGT ACAATCGATA ACGATATCGT TGGTACTGAC TTTACAATCG GTTTTGACAC 18900

AGCGGTTACT ACTGCCATGG ACGCTATCGA TAAGATTCGT GATACATCAT CAAGTCACCG 18960 

TCGTACTTTT GTAATCGAAG TTATGGGACG TAACGCTGGT GATATCGCTC TTTGGGCTGG 19020 

TATTGCAACT GGTGCTGATG AAATCATCAT CCCTGAAGCA GGCTTCAAGA TGGAAGATAT 19080 

CGTAGCAAGC ATCAAAGCTG GTTATGAATG TGGTAAAAAA CACAATATTA TCGTCTTAGC 19140 

TGAAGGTGTG ATGTCAGCGG CTGAATTTGG TCAAAAACTT AAAGAAGCTG GAGATACAAG 19200 

CGACCTTCGT GTAACAGAAC TTGGACATAT TCAACGTGGT GGTTCTCCAA CTGCGCGTGA 19260 

CCGTGTTTTG GCGTCACGTA TGGGTGCACA TGCTGTTAAA CTTCTTAAAG AAGGTATCGG 19320 

TGGTGTTGCG GTTGGTATTC GTAACGAAAA AATGGTTGAA AATCCAATTC TTGGTACTGC 19380 

AGAAGAAGGG GCATTGTTTA GCCTTACTGC AGAAGGTAAG ATTGTGGTTA ACAACCCAGC 19440 

TACAAA 19446 

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16593 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:



TCGTAAATAT GCTCTGTTTT TGGATTTTGT TTCTTAATCT GTTTGGCAAG TGCCTTCATC 60

ATAGAAATAG GACCACACAT ATAGACGGTT GCATGTTCGG GCACTTCTTT TTGTTCAAAA 120

TTAAGATAGC CGTCTTTCGT ACTGTCGATT AGATGGAGTT CAAAATTAGG ATTTTTCTGA 180

GCATAGTTAC GGAGTAAATC TAGGTAGACT GCATTTTCAT CTCCACGGAA GCTATAGTAG 240

AAGTGAACCT GTTTATCTAA AATAGGATGT TCACGGATGT AAGAGATGAA GGGGGTGATC 300

CCAATACCTC CAGCAATCCA AACCTGATTT TCTCGTCCTT CTTCTATGAT CATGTGTCCG 360

TAAGCTCTGT CTAGGGTTAC TTTGCTGCCG GCTTGAAGAT TATCATAGAT ATTCTTGGTA 420

TGGTCGCCTG AAGTTTTAAC AGTAAAGTAA AGAGTTTGAC CATGACCTCC TGAGATAGAA 480

AAGGGATGCG GAGCACTTTC AAAGCCTTCT TGGAAAATCT TTAGAAAGGC AAATTGTCCT 540

GATTGATAGT TGAAAGGTCT GCTAAGATGG ATTTGAATTT CTCTAGTATC GTGATTTAAG 600

CGTTTGAGAT GGGTAATTTT CCCTAGATAG GGGAAGGAAA TCTTTTGATA TAGAAAAATG 660

ATATAAAAAC CAGCTAGTAA GCCTAAAAGG GCATAGCTAC CAACAAGAAA ACTTAGAAGA 720

TTAAATGTAA GGAGACGATT GCCCATTATC ATGTAGATGT GAAAGAGTCC TAAAATATAG 780

GCTAGGTAAA CCAGGCGGTG AATCCATCGC CAAGCTTCGT ATTGGATGTA TTTGCCTAAA 840

TAGGCGACAA GGATGATGCT GGCAAAGATA TAGATGGCAA GATTGCCAAA CTGAGCAGCT 900

AAGCGAGAGC CCCACAAACC GCCCATACTA AAGTTATGAA AGATTAGTAG GATGATTGAG 960

AGAAAGGCTG TGAATTTGTG GACGGTGTAG ACCTTCTCCA AACTGTGAAA CCAGCTTTCT 1020

AGTAGTGGGA GACGAGTGGC TAGGATAAAA GTCAGAGATA GGCTTGTTAA AGCTAGTCCT 1080

GGAATCATGA ATTGGGGAGA AGTGTTCATC CAAGTCAAAA GAGTCAAGAT AAAACTAGCT 1140

ATGATAAAGA GTAGTCCTTT GACTGATTTC ATAGAAAATT CCATTTCATT TAGATTTCGA 1200

TTTGTTGTAA ATAAATTTGT TACATTTTAT CATAGAAAAT GTATGGTGTC AAATTGAGGT 1260

CTATAAATAT CTACTCTCAT CAAAAAACTC TCCAATTGAA CTGGAGAGTG GCTGTTTATA 1320

CTCAATGAAA ATCAAAGAGC AAACTAGGAA GCTAGCCGCA AGTTGCTCAA AACACTGTTT 1380

TGAGGTTGCA GATAGAGCTG ACGTGGTTTG AAGAGATTTT CGAAGAGTGT TATTCTGCAG 1440

CTTGTTGCCA ACGTTTGGCT AGCATATGAG ACAGGCTAGA AATTGCTAGG TTAAAGCTGA 1500

AGTAGATGAG GGCAATCAGG ATGTAAAGAC TGAAGACCTG CTCTGGTTCG AAATAACGGG 1560

CCATGAGAAT TTGGCTGGCT CCAAAGAGTT CTTGTAGGGC GATAACAGAG TAGAGGAGAC 1620

TGGTATCCTT AATCACGGTA ACAAACTGAG AAATGATGGC TGGTAGCATT TTGCGGATGG 1680

CTTGTGGGAG AATGATGTAG TAGAGGATTT GGGCTGAGGT GAAGCCTTGT GACATTCCTG 1740

CTTCGTACTG TCCCTTGTCT ACGGCATTGA GACCGCCTCG AATAATCTCA GCCAAGGCTG 1800

CTGATGTAAA GAGAGTAAAG GCTGTAATAC CTGCTGGTGT GGATTTCATT TTGAACACCA 1860

AAAAGATAGT AAAAATCCAG AGAAGGTTGG GAACGTTGCG CACAAACTCG ATATAAATAG 1920

TGGAAATAAT GCGTAAGACA GGATTTTTGC CATTTCTCGT GACAGCTAGC ACCGTACCGA 1980

TGATAGTAGA GAGGATGATG GCAATCAGAG AAATATAGAG GGTCAAGCCA AATCCTTTAA 2040

AGATAAAGAC TAGGTTATCT GGGGTTAAAA CTTCTAAAAT AGATTCCATA GTAACCTCCT 2100

AAAGTGAATA GGCTTTTTTG TTGGCTTGCT CCATCTTGCG ACCAAACTGG GCAACAGGGA 2160

AGCATAGAGC AAAGTAGAGA AGAGCAGCAC CTAAAAAGGC TGGTATATAG TTTCCGTTGA 2220

GAGCCGACCA AGACTTAGTC ACAAACATCA AGTCTACTCC AGAGATGATA GCTACAGTAG 2280

AGGTGTTCTT GATGAGGTTA ACAATTTGGT TGGTCAATGG AGGGAGAATG ATGCGGAAGG 2340

CCTGAGGCAA GATAATCAAG CGCATGGCAC TGATATAGGT AAAACCTTGC GACAAGGCGG 2400

CCTCCATCTG ACCACTAGGA ATAGACTGAA TCCCTGAACG AATAACCTCA GCGATATAAG 2460

CGCCGTGATA GAGTCCCACG CAGAGAACGG CTGTCCAATA AATTGGAATC ATGATGATAT 2520

GGTCACTGAT AAGAGGTAGG CCATAAAAAA CAATAACAAA CTGCACCAAG AGGGGAGTAT 2580

TTTGGTAAAA TTCAACAAAG ATGCGAGCTA AAATGCGTAA AATTGGACGT TTACTGGTTG 2640 

ACATGGCACC AAAGAAGATG CCCAAAACCA TAGCGAGGAT AAAGGAACCA ACCGCTAGGG 2700 

CAAGGGTGAA GAGGAAACCA TTGAAAAATT GTCCAAAATC CTGAAAATAG GCTGTCCAAG 2760 

ATGATAAATC TGTCATGGGG TGTCCTCCTT AATCTGCAGT ATGGCTAGAT GGTTTGAGCT 2820 

TGTAACGGTC ATAAAGTTTC TGCAAACTAC CATCCTTGCT CCATTTAGTA ACCAAGTTAT 2880 

CAAGATAGTC GTTGAGCTCT GTATTTGATT TCTTGGTAAC AATACCGTAG TCAGATGGCT 2940 

TGAAACTATC ATCTAGTAGT GCTGTCCGTT TACTAGTGTA GCCAGATAGA ATAGAGCGGT 3000 

CAACGGAAAA GGTATCGATA CGATGAGCGT GCAGGGAAGT AATCAATTCT GGGTAGGAAC 3060 

CAAGTTCGAC GAATTTAAAC TTCAGACCTT TCTTTTTACC CAGTTCAGTA ATCAGGCGTT 3120 

GGGTGATAGA ACCTTGGGCG ACTCCGATGG TTTTGCCGTT TAGGTCCTCA ATCTTTTTGA 3180 

TTTTGGCAGA TTTATTGACC AAAAATCCAG AAGCGTCTGT GTAGTAGGGA CTGGTAAAGT 3240 

TGTAGAGTTT TTTGCGTTCG TCCGTGATGG TAAAGGTCGC GATATCGATA TCGACCTGTT 3300 

CATTGTCTAG AAGGGGGCCG CGGGTTTGTG CTGTAACCGG CACATAGCGA ATCTTGACCT 3360 

TGAGTTCATC AGCTACCATC TTGGCCAAGT CGGTTTCGAT ACCAGAATAA GTACCGGTCT 3420 

TGGGATCTTT GTAACCAAAA TTGGGAACGT CTTGTTTGAC ACCGACAACC AGTTCGCCTC 3480 

TTTTTTGAAT GTCTGCGATA CTTGTATCAG CCTGGACTGG TTTGGCAGCA GCAAGGCCGA 3540 

AAAGGCTAAT CAATAATGCT GATAAAAAGA ATTTTTTTTC ATAGGCGCGT CCTTATTTGA 3600 

CTTTGTCACT TTCGTGGTTG ATAATTTTGC TGAGGAATTG TTGGGCACGA GGTTCGCTTG 3660 

GATTGTCAAA AAAGTTATCG ACATCTGTCG TATCTACTAA AACTTCTCCG TCGGCCATAA 3720 

AGATAATGCG GTCCGCAACC TCTCGAGCAA AGCCCATTTC GTGGGTAACG ATGATCATGT 3780 

TCATCCCATC ATGCGCCAGT TTCTGCATAA CTGCTAGAAC ATCTCCGATA GTCTCAGGAT 3840 

CAAGAGCAGA TGTTGGTTCA TCAAAGAGGA GGAGTTCCGG ATGCATAGCA AGACCACGAG 3900 

CGATGGCGAT CCGCTGTTTT TGTCCACCAG ATAGCATGGC GGGATAGGAA TCTTTCTTGT 3960 

CCCACATATT TACAAATTCC AGATATTTTT GGGCGGTTTT TTCAGCTTCT TTTTTATCAA 4020 

TTCCTAGAAC TTCAATGGGT GCAAGCGTTA CGTTTTCTAA CACAGCTTTG TGTGGATAAA 4080 

GGTTAAAATG TTGAAAAACC ATGCCGACTT CCTTGCGAAG AGGTACCAAA TCTTTCTGGC 4140 

TGGCACCAGC AACTTGGTGC CCATTGACTA GGAGACTTCC TTTGTCAACA GTCTCTAAAC 4200 

CATTGATCGT ACGGATAAGA GTGGACTTCC CAGAGCCAGA AGGTCCAAGC AGGACAACAA 4260 

CTTGTCCTTT TTCAAAACGG AGATTGATGT TGCGGAATGC GTGGTAGTCT CCGTAATATT 4320 
TTTCGACGTT TTTAAATTCT ACTAAAGCCA TGAGAGATCT CTATTGTGTT ATATTTTATA 4380

ACACGGTTCT ACAATAAAAG AATGTTCTTG TCAAATCATA TCTGAAAAAA TTCACTATAG 4440 

TGAAATAAGA ACAGGAAAAA TCGATCGGGA CAGTCAAATC GATTTCTAAC AATATTTTAG 4500 

AAGTAGAGGT GTACTATTCT AGTTTCAATA TACTATAAAA TGTTATAAAA AAGCAATCTG 4560 

GATAGAGAAA ACGTCTAAAT CATGTTATAA TGAAGCAATA GAATTCTTAG AAAGAGTGGA 4620 

TGTCTTTTTG ATAACACCTA CTTATGAATG GCAGTTTGCG CTGCAGGTAG AAGATGCGGA 4680 

TTTTACAAAG ATAGCCAAGA AGGCTGGACT GGGTCCTGAG GTGGCTCGGT TATTGTTTGA 4740 

GAGAGGGATT CAGAACCAAG AAAGTCTGAA GAAGTTTTTA GAACCTTGCT TGGAGGACTT 4800 

ACATGATGCT TATCTGCTCC ATGATATGGA CAAGGCAGTG GAGCGGATTG GTCAGGCTAT 4860 

TGAAGAAGGG GAAAATATTC TTGTTTATGG AGACTATGAT GCGGATGGCA TGACTTCGGC 4920 

TTCTATTGTG AAGGAAAGTT TGGAACAACT TGGTGCTGAG TGCCGAGTTT ACCTGCCAAA 4980 

TCGTTTTACC GATGGCTATG GCCCTAATGC TAGTGTTTAT AAATACTTTA TCGAGCAAGA 5040 

AGGGATTTCC TTGATTGTGA CGGTGGACAA TGGGGTTGCT GGTCATGAGG CTATTGCATT 5100 

GGCTCAGTCT ATGGGAGTAG ATGTCATTGT GACAGACCAT CATTCCATGC CTGAAACCCT 5160 

GCCAGATGCT TATGCTATTG TCCATCCTGA ACATCGAGAT GGGGATTATC CTTTTAAATA 5220 

TTTGGCTGGT TGTGGAGTTG CTTTCAAGTT GGCTTGTGCC CTGTTAGAAG AAGTGCAAGT 5280

GGAATTGCTT GATTTGGTCG CTATTGGAAC TATTGCAGAT ATGGTGAGTC TGACGGATGA 5340 

AAATCGTATC TTAGTTCAAT ATGGTCTGGA AATGTTGGGT GATAGCCAGC GCATTGGTGT 5400 

GCAAGAAATG CTGGACATGG CTGGGATTGC TGCCAACGAA GTAACAGAAG AAACGGTTGG 5460 

TTTCCAGATT GCTCCTCGTT TGAATGCCTT GGGTCGCTTG GATGATCCGA ATCCTGCCAT 5520 

TGATTTGTTG ACTGGATTTG ATGATGAGGA AGCGCATGAG ATTGCCCTTA TGATTCACCA 5580 

GAAAAACGAA GAGCGCAAGG AAATCGTTCA GTCTATCTAT GAAGAAGCCA AGACCATCGT 5640 

GGATCCTGAG AAGAAGGTTC AGGTCTTGGC CAAGGAAGGG TGGAATCCTG GGGTTCTAGG 5700 

AATCGTGGCT GGTCGTTTAT TGGAAGAATT GGGACAGACA GTCATTGTTC TTAATATAGA 5760 

AGACGGTCGT GCCAAGGGCA GTGCTOGTAG TGTGGAAGCG GTCGATATTT TTGAAGCTGT 5820 

GGATCCCCAT CGAGACCTCT TCATCGCCTT TGGAGGTCAT GCAGGTGCAG CGGGTATGAC 5880 

GCTGGAAGTT GAGCAACTCT CAGATTTATC TCAGGTTTTG GAAGATTATG TTCGTGAAAA 5940 

AGGTGCAGAT GCTGGTGGCA AGAATAAGTT AAACCTAGAT GAAGAGTTGG ATTTGGAGGC 6000 

ACTTAGCTTG GAAACGGTCA AAAGTTTTGA ACGTTTAGCT CCTTTTGGAA TGGATAATCA 6060 
GAAACCTATT TTTTATATCA AGAATTTTCA GGTCGAAAGT GCTCGTACTA TGGGGGCAGG 6120

TAATGCCCAT CTAAAGCTGA AAATTTCCAA GGGTGAGGCG AGTTTTGAAG TGGTAGCCTT 6180

TGGTCAAGGC AGATGGGCGA CAGAGTTTTC TCAAACCAAG AATCTAGAGT TAGCGGTTAA 6240

ATTGTCTGTC AACCAATGGA ATGGCCAAAC TGCCCTCCAG TTGATGATGG TGGATGCGCG 6300

AGTGGAAGGT GTTCAACTTT TTAACATTCG TGGAAAAAAT GCAGTCTTGC CAGAAGGTGT 6360

TCCAGTCTTG GATTTTCCTG GAGAACTGCC AAATCTTGCG GCTAGTGAAG CTGTTGTCGT 6420

AAAAAACATT CCAGAGGATA TTACTCAGCT GAAGACCATT TTTCAGGAAC AGCATTTCTC 6480

TGCTGTCTAT TTCAAAAATG ATATTGACAA GGCTTATTAT CTGACAGGTT ATGGGACTAG 6540

AGATCAGTTT GCCAAATTGT ACAAGACTAT TTACCAGTTC CCAGAGTTTG ATATTCGCTA 6600

CAAGCTGAAA GATTTGGCTG CATATCTTAA TATTCAACAA ATCTTGCTGG TCAAGATGAT 6660

TCAAGTATTT GAAGAACTAG GCTTTGTGAC GATAAAAGAT GGTGTGATGA CAGTCAATAA 6720

AGAGGCGCCA AAGCGGGAGA TAGGAGAAAG TCAAATTTAC CAAAATCTCA AACAAACCGT 6780

TAAAGACCAA GAAATGATGG CGCTGGGTAC GGTGCAAGAA ATTTATGATT TTTTGATGGA 6840

AAAAGAGTAG AAGTTAGGAA AGAGTTGGGA AATCAACTCT TTTTTGAAAA CAGACCTTCA 6900

TTTTGAAAAT CATCAAAAAA ATGGTATAAT GGTAGGAAAA GATTCGGCTG AAAGTATCAG 6960

AACTTTTAGA ATAAGAGGGT AGAATTGCCC TATAATCAAG ATAAACTAAG ATTTTGGAGG 7020

AAAAATGAGT AATATCAGTT TAACAACACT TGGTGGTGTG CGTGAGAATG GAAAAAATAT 7080

GTACATTGCT GAAATTGGAG AGTCCATTTT TGTTTTGAAT GTAGGGTTAA AATATCCTGA 7140

AAATGAACAA TTAGGGGTCG ATGTGGTGAT TCCAAACATG GATTACCTTT TTGAAAATAG 7200

CGACCGTATT GCTGGGGTTT TCTTGACCCA CGGGCATGCG GATGCCATTG GTGCTCTACC 7260

GTATCTCTTG GCAGAGGCTA AAGTTCCTGT ATTTGGGTCT GAGTTGACCA TTGAGTTGGC 7320

AAAGCTCTTT GTCAAAGGAA ATGATGCCGT TAAGAAATTT AATGATTTCC ATGTCATTGA 7380

TGAGAATACG GAGATTGATT TTGGTGGGAC AGTGGTTTCC TTCTTCCCTA CGACTTACTC 7440

CGTTCCAGAG AGTCTGGGAA TTGTCTTGAA GACATCGGAA GGAAGCATCG TTTATACAGG 7500

TGACTTCAAA TTTGACCAAA CGGCTAGTGA ATCTTATGCA ACTGATTTTG CTCGTTTGGC 7560

AGAGATTGGT CGTGACGGCG TCCTGGCTCT CCTCAGTGAT TCGGCCAATG CAGACAGCAA 7620

TATTCAGGTG GCTAGTGAAA GTGAAGTTAG GGATGAAATT ACCCAAACTA TTGCTGACTG 7680

GGAAGGTCGT ATCATCGTTG CAGCTGTTTC CAGTAATCTT TCTCGTATTC AGCAGATTTT 7740

TGACGCTGCG GATAAAACAG GTCGACGTAT CGTCTTGACA GGATTTGATA TTGAAAATAT 7800

CGTCCGCACA GCGATTCGTC TTAAGAAGTT GTCTTTAGCC AACGAAATTC TTTTGATTAA 7860

GCCTAAAGAT ATGTCTCGCT TTGAAGACCA TGAGTTGATT ATTCTTGAGA CAGGTCGTAT 7920

GGGTGAGCCT ATCAATGGAC TTCGTAAGAT GTCGATTGGT CGCCATCGTT ATGTAGAAAT 7980 

CAAGGATGGG GACCTAGTCT ATATTGCTAC GGCTCCGTCT ATTGCTAAAG AAGCCTTTGT 8040 

TGCGCGTGTG GAAAATATGA TTTATCAGGC AGGTGGGGTT GTCAAATTGA TTACCCAAAG 8100 

TTTACATGTA TCAGGGCACG GAAATGTGCG TGATTTGCAG CTGATGATCA ATCTTTTGCA 8160 

ACCTAAGTAC CTCTTCCCTG TCCAAGGGGA GTATCGTGAG TTGGATGCTC ACGCTAAGGC 8220 

TGCCATGGCA GTTGGGATGT TGCCAGAACG CATCTTCATT CCTAAAAAGG GGACGACCAT 8280 

GGCTTACGAG AATGGAGACT TTGTTCCAGC TGGATCGGTT TCAGCAGGAG ATATCTTGAT 8340 

TGATGGGAAT GCCATTGGTG ATGTTGGAAA TGTTGTTCTT CGTGACCGTA AGGTCTTGTC 8400 

AGAGGATGGA ATTTTCATCG TGGCTATTAC AGTCAACCGT CGTGAGAAGA AAATTGTGGC 8460 

TAGGGCTCGT GTTCACACGC GTGGATTTGT TTATCTCAAG AAGAGTCGCG ATATTCTCCG 8520

TGAAAGTTCA GAATTGATTA ACCAAACGGT AGAAGAGTAT CTTCAAGGAG ATGACTTTGA 8580

CTGGGCAGAT CTCAAAGGTA AGGTTCGTGA CAATCTGACC AAGTACCTCT TTGATCAAAC 8640 

CAAGCGTCGC CCAGCCATTT TACCAGTAGT CATGGAAGCA AAATAATCGT TGAAATAAAC 8700 

AGAGAGAAAG TCGAGTTTCG GCTTTTTCTT ATAGAAAAAT AGAAGGAGAA AATCATGGCA 8760 

GTGATGAAAA TCGAGTATTA CTCACAAGTA TTGGATATGG AGTGGGGGGT GAATGTCCTC 8820 

TACCCTGATG CCAATCGAGT GGAAGAACCA GAGTGTGAAG ATATTCCCGT CTTGTACCTT 8880 

TTGCACGGGA TGTCTGGAAA TCATAATAGT TGGCTTAAGC GGACCAATGT AGAACGCTTG 8940 

CTTCGAGGAA CTAATCTCAT CGTTGTTATG CCCAATACCA GCAATGGTTG GTACACCGAT 9000 

ACCCAGTATG GTTTTGACTA CTACACGGCT CTAGCAGAGG AATTGCCACA GGTTCTGAAA 9060 

CGCTTCTTCC CTAATATGAC GAGCAAGCGT GAAAAGACCT TTATCGCTGG TCTTTCTATG 9120 

GGAGGCTACG GCTGCTTCAA ACTGGCTCTT ACGACAAATC GTTTTTCTCA TGCAGCTAGT 9180 

TTTTCAGGTG CCCTCAGCTT TCAAAACTTT TCTCCTGAAA GTCAAAATCT GGGAAGTCCA 9240 

GCCTACTGGA GAGGTGTTTT TGGAGAGATT AGAGACTGGA CAACTAGTCC CTATTCTCTT 9300 

GAAAGTCTGG CTAAAAAATC GGATAAAAAG ACCAAACTTT GGGCGTGGTG TGGCGAACAG 9360 

GATTTCTTGT ACGAAGCCAA TAATCTCGCA GTGAAAAATC TCAAAAAACT AGGTTTTGAT 9420 

GTGACCTATA GCCATAGCGC TGGAACTCAC GAGTGGTACT ACTGGGAAAA ACAATTGGAA 9480 

GTTTTTTTAA CAACCCTACC AATTGATTTC AAATTAGAAG AGAGACTGAC TTAGTTTGAA 9540 

CTTCAGCATA GGGGGAGTAG AACTAAAATA AAATATGTTT TCACTAGACT TTTCAAACGm 9600 
AAGTAGTAGA ATAGTAATAA AATACTGGAG GAAAGAGAGT AGGAAATGTA CCGTTATCAA 9660

ATTGGCATTC CCACATTAGA ATATGATCAG TTTGTCAAAG AACATGAATT AGCCAATGTA 9720 

TTACAAAGTA GTGCTTGGGA GGAAGTTAAG TCTAATTGGC AACATGAGAA GTTTGGTGTT 9780 

TACAGGGAAG AAAAATTACT GGCGACAGCT AGTATTTTGA TTAGAACTCT TCCGCTAGGC 9840 

TATAAAATGT TTTACATCCC AAGAGGACCT ATATTGGATT ATGGGGATAA AGAACTCTTG 9900 

AATTTTGCCA TTCAGTCTAT TAAGTCCTAT GCTCGCAGTA AGAGAGCGGT TTTTGTGACT 9960 

TTTGACCCAA GTATTTGCCT ATCTCAAAGT TTAATCAATC AGGAAAAGAC AGAATTTCCT 10020 

GAAAATCTGG CTATTATTGA TAGTTTGCAA CAAATGGGAG TAAGGTGGTC AGGAAAAACG 10080 

GAGGAAATGG GAGACACCAT TCAACCTCGT ATTCAGGCGA AAATATACAA GGAAAATTTT 10140

GAAGAAGATA AACTTTCCAA GTCAACAAAA CAGGCTATTC GAACAGCACG AAACAAAGGG 10200 

CTTGAGATTC AATATGGTGG ACTGGAACTA TTAGATTCAT TTTCGGAGTT GATGAAAAAA 10260 

ACTGAGAAGC GAAAAGAGAT TCATTTGAGG AATGAAGCCT ATTATAAAAA ATTGTTAGAT 10320 

AATTTTAAGG ACAAGGCCTA TATCACCTTG GCCACCTTGG ATGTTTCTAA ACGTTCGCAA 10380 

GAGTTAGAAG AACAGTTAGC GAAAAATAGA GCCTTGGAAG AGACCTTTAC TGAGTCGACT 10440 

CGAACTTCAA AAGTAGAAGC GCAGAAGAAG GAAAAAGAAC GTTTGTTAGA GGAATTGACC 10500 

TTCTTGCAGG AATATATAGA TGTAGGTCAA GCGAGAGTTC CTTTAGCGGC TACTTTGAGT 10560 

TTGGAATTTG GTACTACCTC TGTCAATATA TATGCTGGTA TGGATGATGA TTTTAAACGT 10620 

TACAATGCAC CAATTTTAAC TTGGTATGAA ACGGCTCGCT ATGCCTTTGA ACGAGGTATG 10680 

ATCTGGCAAA ATTTAGGTGG TGTTGAAAAC TCTCTCAATG GTGGACTTTA TCATTTTAAG 10740 

GAAAAATTTA ATCCAACGAT TGAAGAATAC TTGGGTGAAT TTACAATGCC CACTCATCCT 10800

CTCTATCCTC TGTTAAGACT TGCTCTTGAT TTCCGTAAAA CATTAAGAAA AAAACATAGA 10860 

AAGTAAGTAT ATGGCACTAA CAACACTCAC GAAAGAAGAG TTTCAGACTT ATTCTGATCA 10920 

GGTTTCTTCT CGTTCCTTTA TGCAATCTGT CCAGATGGGG GATTTGCTAG AAAAAAGAGG 10980 

GGCTCGAATT GTTTATCTTG CTTTGAAACA AGAAGGAGAA ATTCAAGTTG CAGCTCTGGT 11040 

TTATAGCCTG CCCATGCTGG GTGGTCTGCA TATGGAACTC AATTCGGGGC CGATTTATAC 11100 

CCAACAAGAT GCTCTTCCAG TTTTTTATGC AGAGTTAAAA GAATATGCCA AGCAAAATGG 11160 

TGTATTAGAG TTGCTTGTAA AACCCTATGA AACTTATCAA ACTTTTGATA GCCAAGGTAA 11220 

TCCAATAGAT GCTGAGAAAA AAAGTATTAT TCAAGATTTG ACTGATTTAG GTTATCAATT 11280 

TGATGGCTTA ACAACAGGTT ACCCAGGTGG AGAACCAGAT TGGTTATACT ATAAAGATTT 11340 

AACTGAATTA ACTGAAAAGA GTTTGCTTAA AAGTTTTAGC AAAAAGGGTA AACCCTTGGT 11400 
GAAAAAGGCT GAAACCTTTG GCATTCGGTT GAAAAAGTTA AAACGTGAAG AACTATCGAT 11460

TTTTAAGAAT ATAACAAAAG AAACCTCTGA ACGTAGAGAA TATAGTGATA AAAGTTTAGA 11520

ATATTATGAG CATTTTTATG ATACTTTTGG AGAACAAGCG GAGTTTCTCA TAGCAAGCTT 11580

AAATTTTTCG GACTATATGA GCAAATTGCA AGGTGAACAA AGTAAACTAG AAGAAAACTT 11640

GGACAAGTTG CGACTTGATT TGAGTAAAAA TCCTCATTCT GAGAAAAAAC AAAATCAACT 11700

GAGAGAATAT TCTAGTCAAT TTGAAACGTT TGAAGTTCGA AAAGCAGAAG CGCGAGACTT 11760

GATTGAAAAA TATGGAGAAG AAGATATTGT TTTAGCTGGG AGTTTATTTG TTTATATGCC 11820

TCAGGAAACG ACTTATCTCT TTAGTGGTTC CTACACTGAG TTTAATAAGT TCTATGCCCC 11880

TGCACTGCTT CAAAAATATG TTATGTTGGA AAGCATAAAA CGTGGAATAC CTAAATACAA 11940

CTTCCTAGGC ATTCAAGGGA TTTTTGATGG AAGTGATGGT GTTTTGCGTT TTAAACAGAA 12000

TTTTAATGGC TATATTGTAC GCAAAGCAGG TACTTTCCGT TACCATCCAT CGCCTTTAAA 12060

ATACAAAGCT ATCCAGTTAC TCAAAAAAAT AGTAGGACGT TAAGATGAAA AAGTCAGTAT 12120

TTAGATTTCT TTTAGCTTCT TTTAGTAAAA TAATTCTTAT TTGCTAGAAA GGTGGAGAGA 12180

CATGCGCTGG CTTTTTCGTT TGATAGGGGC TTTCTTTTCT TTTGTGTGGC GTTTGTTTTG 12240

GCGTCTGGTT TGGATAGTTG TGCTCTTATG TGTGCTTGCT TTCGGACTTC TCTGGTATCT 12300

GAACGGAGAT TTTCAAGGAG CGCTAAAGCA AGCAGAACGG TCAGTAAAAA TTGGTCAACA 12360

AAGTATTGAC CAATGGGAGA AAACAGGGCA ACTGCCTAAG TTAAGCCAGA CAGATAGTCA 12420

CCAGCATTCT GAAGGAAGGT GGGCACAGGC CTCTGCTCGT ATTTACCTGG ATCCGCAGAT 12480

GGATTCACGC TTTCAAGAGG CTTATTTAGA AGCAATCCAG AACTGGAATC AAACTGGTGC 12540

TTTTAACTTT GAACTCGTGA CTGAGTCTAG TAAGGCGGAT ATTACGGCTA CGGAGATGAA 12600

CGACGGAGGC ACTCCTGTGG CAGGAGAGGC GGAAAGTCAA ACTAATCTCT TAACAGGGCA 12660

ATTCTTGTCC GTAACGGTGC GGTTGAATCA TTATTATTTG TCCAATCCAT ACTATGGCTA 12720

CTCCTATGAA CGCCTTGTCC ATACGGCAGA ACATGAGTTA GGTCATGGGA TTGGCTTGGA 12780

CCATACAGAT GAGAAGTCTG TCATGCAACC AGCAGGTTCC TTTTATGGTA TCCAGGAAGA 12840

GGATGTTGCA AACCTCCGAA AAATATATGA GACTAGTGAG TAGGGTACTA TCTTTCCCTA 12900

CTTTTTTTGC TATAATGGAA CTATGAACAA CTTGATTAAA TCAAAACTAG AGCTCTTGCC 12960

GACCAGCCCT GGTTGCTACA TTCATAAGGA TAAAAATGGC ACCATTATCT ATGTAGGAAA 13020

GGCTAAAAAT CTGCGTAATC GAGTACGGTC CTATTTTCGT GGAAGTCATG ATACCAAGAG 13080

AGAGGCTCTG GTGTCTGAAA TTGTGGATTT TGAATTTATT GTTACGGAGT CTAATATTGA 13140






GGCACTTCTC 


CTAGAAATCA ACCTGATCAA GGAAAACAAH rrraanTirn 


A 1 A 1 LA luL I 


13200 


CAAGGATGAC 


AAGTCCTATC CTTTCATPAA A ATP A PP A at rzunrTir^T 1 a'pr* 


t_AL<jl~ 1 I\jAT 


13260 


TATCACTCGT 


CAGGTCAAAA AGGACGGAGG TCTTWATTTT cnarrfPHTP 




13320 


GGCAGCCAAT 


GAAATCAAGf GtTnYSPTfiflA , P^'t^^;A'Pl^TTT , PP^wnwwira 


AGTGTACCAA 


13380 


CCCGCCCTCT 


mtuuiuiuii ulftllALtn lnl\AiliLU\\2 xljlATvSGCCC 


ACACCATCTG 


13440 


TAAGAAGGAT GAGGCTTATT TCAAGTCTAT GGCCCAGGAG GTGTCTGATT TTCTGAAAGG 13500

TCAGGATGAC AAAATCATCG ATGATCTCAA GAGTAAAATG GCAGTAGCAG CACAAAGTAT 13560

GGAGTTTGAA CGTGCGGCGG AATACCGTGA CCTGATTCAG GCTATTGGAA CGCTTCGAAC 13620

CAAGCAACGG GTCATGGCGA AAGATTTGCA AAATCGCGAT GTCTTTGGCT ACTATGTGGA 13680

TAAfififJtTWl ATGTGTGTGC AGGTTTTCTT TGTCCGTCAG GtAAGCTCAT CGAGCGCGAT 13740

mr* a ATrrvr TCCCCTACTT CAATGATCCA GATGAGGATT TTTTGACCTA TGTAGGACAA 13800

TTCTATCAAG AAAAATCTCA TCTAGTTCCC AATGAGGTAC TGATTCCGCA GATATTGACG 13860

AAGAAGCTGT CAAGGCTTTG GTGGATTCCA AGATTCTTAA GCCTCAACGT GGAGAGAAAA 13920

AACAACTGGT CAATCTAGCC ATAAAAAATG CTCGTGTTAG TCTAGAGCAG AAGTTCAATC 13980

TGCTAGAAAA ATCTGTCGAA AAGACTCAAG GAGCTATTGA AAATCTAGGG CGTTTGCTCC 14040

AAATCCCGAC CCCAGTACGT ATCGAGTCCT TCGATAACTC TAATATCATG GGAACTAGCC 14100

CTGTTTCGGC TATGGTGGTC TTTGTCAACG GTAAACCGAG TAAGAAGGAT TACCGTAAGT 14160

ACAAGATAAA AACGGTTGTT GGACCAGACG ACTATGCCAG CATGAGAGAG GTCATTCGCA 14220

GACGCTATGG TCGAGTACAG CGTGAGGCTT TGACTCCTCC AGATTTGATT GTGATTGATG 14280

GGGGGCAAGG TCAAGTCAAT ATCGCTAAGC AGGTTATCCA AGAGGAACTG GGCTTGGATA 14340

TTCCAATTGC TGGGCTGCAA AAGAATGATA AGCACCAAAC CCATGAATTG CTCTTTGGAG 14400

ATCCGCTTGA GGTGGTGGAT TTGTCTCGCA ATTCTCAGGA ATTTTTCCTC CTCCAACGCA 14460

TCCAAGATGA GGTGCACCGC TTTGCTATCA CTTTCCACCG CCAACTGCGC TCCAAAAATT 14520

CTTTCTCATC TCAATTGGAT GGGATTGACG GTCTGGGACC TAAACGCAAG CAGAATCTTA 14580

TGAAGCATTT CAAGTCTTTG ACCAAAATCA AGGAAGCCAG TGTGGATGAG ATTGTCGAAG 14640

TTGGGGTACC TAGAGTCGTT GCAGAGGCTG TGCAAAGAAA GTTGAACCCG CAGGGAGAAG 14700

CCTTGCCTCA AGTAGCAGAA GAAAGAGTAG ATTACCAAAC GGAAGGAAAC CACAATGAAC 14760

CATAAAATCG CAATTTTATC AGATGTTCAT GGCAATGCGA CGGCGCTAGA AGCAGTGATT 14820

GCAGATGCTA AAAATCAAGG GGCCAGTGAA TATTGGCTTC TGGGAGATAT TTTTCTTCCT 14880

GGTCCAGGCG CAAATGACTT AGTCGCCCTG CTAAAGGACC TTCCTATCAC AGCAAGTGTT 14940

CGAGGCAATT GGGATGATCG TGTCCTTGAG GCTTTAGATG GGCAATATGG CTTAGAAGAC 15000

CCACAGGAAG TTCAGCTCTT GCGTATGACA CAGTATTTGA TGGAGCGAAT GGATCCTGCA 15060

ACGATTGTCT GGCTACGAAG CTTGCCTTTG CTGGAAAAGA AAGAAATTGA CGGATTGCGC 15120

TTTTCTATCT CTCATAATTT ACCTGACAAA AACTATGGTG GTGACTTGCT AGTTGAGAAT 15180

GATACAGAGA AATTTGACCA ACTGCTAGAT GCGGAAACGG ACGTGGCAGT TTATGGTCAT 15240

GTTCACAAGC AGTTGCTTCG TTATGGAAGT CAAGGGCAAC AAATCATCAA TCCAGGGTCG 15300

ATTGGCATGC CCTATTTTAA TTGGGAGGCG TTAAAAAATC ACCGTTCCCA GTATGCCGTG 15360

ATAGAAGTTG AAGATGGGGA ATTACTCAAT ATCCAATTTC GTAAAGTTGC TTATGATTAC 15420

GAAGCTGAGT TAGAATTGGC CAAGTCCAAG GGGCTTCCCT TTATCGAAAT GTATGAAGAA 15480

CTGCGTCGTG ACGATAACTA TCAGGGGCAC AATCTGGAAT TATTAGCCAG CTTAATAGAA 15540

AAGCATGGGT ATGTAGAGGA TGTGAAGAAT TTTTTTGATT TTTTGTAAGA GTTTCCTAAA 15600

ATAGCCAATG CAAACTAAAA AAGCGATTTG CTGGTCCAAT CGCTTTTAGT ATATCTTATA 15660

CTCAATGAAA ATCAAAGAGC AAACTAGGAA GCTAGCCGTA GGTTGCTCAA AGCACAGCTT 15720

TGAGGTTGCA GATAAAGCTG ACGTGGTTTG AAGAGATTTT CGAAGAGTGT TATTGTAACT 15780

GAGATTGATC TGGGAGGTAA GAACCACCTA GATAGGTATT GCTGAGTTTT TCAAGGGTTC 15840

CGTCTTGATA GAGTTCTTTG AGCGCTTTAT CAAATTGCTC TTTAAACTCT TTTTGGTCGC 15900

TTGAGAAAAT GATATAATTG CTGGGGCTAT CTGCAGAAGG TAAATCAACG ACTGAGAGGT 15960

CTAAACCACG GTCCTTGATA ATCTTTTGAA CGGATACCTT GTCAAAAACT AGGAAATCAA 16020

ACTCTCCGTT AGCAAGGTCT AGGATTCGTT TACCAATATC CTCACCAGAA AAATTAATTG 16080

TAGCGGGATT ATCAGTGTGT TTCTGATTCC AGTTATTGAT GAATTGAGCG TTAGAAGTTC 16140

CGGTATCCTC TTGTGTTGTT TTACCAGCGA TCTGGTCAAG AGAAGTCAAA GGATTTTTCT 16200

TGTTGCTGAC AAGGACGAGG GGATTGTTGG AAATTGGAAG CGAGTAAAGG TATTTTTCAG 16260

CACGCTCTTT TGTGTAACTC AAGTTATTGG CCGCAGCCTG ATAGTGACCA GAATCAAGTC 16320

CTGGGAAGAT GCTCTCCCAG GCGGTTCTTT GGAATTGAAT CTCGTAGTCG CTGAGTTTTT 16380

CATCTACTGC CTTTAAAACT TCGATATCAA AGCCTGTCAG ATTGCGCTTG TCTTCGTAGT 16440

CAAATGGTGG CACGTCGCCA GCTGTAGCAA GGACGATTGT CTTTTGAGCG CTAGTCTCTT 16500

TGGGTGTAGC TTGATTCTCA CAGGCAACCA AAAATGGTAG GATAGCTAGT AATAGGCTAA 16560

ATTTTTTCAT ACTGTCTCCA TTCAAATGTA AAG 16593


(2) INFORMATION FOR SEQ ID NO: 53: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3510 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 



GGGATATCCT TATATCCTTG TTCCTnriA AP raTTftTfWi aTvprrwpr'Aiir' 


AGTTTTTTCA 


60 


CCTTGAATTC CTGGTGCAAT GACAGTAAfiA iTTTrr:! a zvp c a nc a ipr»Tvr 


TTTCGCCGCT 


120 




CCAAAACTTC 


180 


AAGTAAACCT TT r PTAr , Pr ,| T v ? AAAA'PPfimi'P atr^p^n * /-mtv-p pmw>nrpri»crn 


CATGGATTGC 


240 


AATGTGAAGT PTCfiAfiPlTV 1 TT'WrPainft a * rpmrn/-*>rruTi />m >r<> i^rr»y^>^m 

nn* j. ww-tvsx v» lyuftOLft jrv. ill 1 v_l_AAL.A tat-AATTTGTT GTACAGTCGT 


TTGTTGTTTT 


300 


GGCTGTTGTG CTGCTTfiAfyp PTTTTT'&r'nwp wwrvwitp PK^»r>rinjni»ni 


CAATACAACT 


360 


r*r* i^n\~nr\\3n unui X t\f\\3\~\. nuUUUtLAl 1 At. 1 1 11 I I I- A I I iwTCCTCC 


TTTATTCAAA 


420 


AATTCCAGCT AGAACATTTA CTTGTrp^PAA TirTinnin aavpr , r«r' jmtwtir 

" vjnriv -" Ail/* \>i lulLLlnn lAulnALAAA A l i t. C vJA 1 1 A 


AAACAATGAG 


480 


GAAACCACCA ATTTTCTTTA fyPAfVAnVAT aiV^arv^rHTVTV^ fcfpfTwr*mx/-wn» * 


AATATGGCAT 


540 


GACTAGACCT GAAGCTAGTG CCAATArTAA nAAArifAAnr' rpp&nvr>n a n 


AGTGTAAATG 


600 


AGAGTATAAA TCGCTCCTTG CCAAGCGCCA TTClCCTrr&c: infirrrraip 


TGCTAAAACA 


660 


GAACTTAAAA CTGGACCAAT ACAAGGTGTC CAACCAAAGC TAAAGGTAAT ACCAAGTAAA 720

AAAGCTGACC AATAACGATT AGAATCTGAT TTTTTAAAGG TAAAACTTTT TTGAACTTCT 780

AATTTCTTCA AATGAAAAAT TTCCATCTGG TGAAGACCCA AAATGATAAT AATAGCTCCC 840

ATGCCATATC GAAACCAATT TGCATAGAGA ATATGACCAA AGTAACCAGC ACCAAAGCCT 900

AGAATAAAGA AAATGAGAGA GATACCAGCG ATAAAGCAAA GTGTTCGAAT CAAGCCTGAC 960

CAGAGAACCT TTCTCCCAAA CAAAGAAAAG CTTTTTGCAC TTTCTTGATC ATCCAATAAA 1020

ATCCCAGCAT AGACTGGCAG AAGAGGAAAA ATACAAGGAG AAAAAAAGGA TAAAACACCT 1080

GCTAGAAAAA CAGAGATTAA AAATACTATC GTTTCCAATA AAGAACCAAC TTTCTTAATA 1140

ATTCTAATCC TATTTTACTA TATTCAATTT TATTTGTAAG CTTTCTGCTA CGCAAAATCG 1200

TATCGGGCAC TATTGGACCA ATCTTTTCTT TTGCTAGTCA AGGCGGATCT TATCCCCCAA 1260

AATAGCCAAA AAGCAACGAC AAGGATTACT CATCGCTGCT TTTGTGAACG AAAATGTCTT 1320

TTAGGTCTGA CATTTCATAA ATCATGTTTT ACTTGAGTTT GTCAAGGATT GCTTTAAGCT 1380

CCTCTACTAG TTTAGTTTCT GTCTCTGCTG AGCCATTTTC TTCTTTCACG AAATCAAGGG 1440

TTTCTTGGAG AAGGTTTTGG GCTTTGGCAA GGACTTTTTT ATCCGCTTTT TCTGCATCTA 1500

GCTGTCCTAG AACCTTGATC AATTCCGTGC TTAATTGCTG GATTTCTGAC TCTTTCTTAC 1560

GGCGAATCAG CCAGAAGGCA ATCACGCCTA GGAGGGCAAG TAGACTGACC ACAATCACTC 1620

CTGCCGGAAC TGAGTTTGTT TCAGTCATCT TATCTGAATC CTTACTATCT TCCGTTCCTT 1680

GTTTTGCATC CTTCTTGTCC TGTGCAGGCT TGCTGTCGCT AGCATTTGCT TTCACATCTT 1740

TGAGAGAGTC CAAGGCAGCC CAGCCTTCAC AGACTCTACT GCAGTATGCA GACCTTACTC 1800

TGTCAAGGCA CTATCTTCCG GAGCTTTTTG AGCATCTAGG AGGACAGCCT TGGTTGCATC 1860

GATTTTCGGA TCAGATACTG TTGCCAAAGC TTTCAAGCGT TGGTCTAACT CTTGACTCAA 1920

GGCACGAAGT TCAGACTTGT CAACTTGCTC TTGAGCTTGT GTGCTCGTTG AGCTAGCCGA 1980

AGCGCTTGCT ACCACTCTAG GATCTTGAGT CGGAGCTGAG CTTGGAGCTG GGACAGGGCT 2040

TGCAGGTTGA CTAGGAACAG TTATGGTATA TTGAAACTAG AATAGTACAT ATGGACTTCT 2100

AAAACATTGT TAGAATTCGA TTTTACTGTC CTGATCGATT TGTCCTATTC TTATTTCATT 2160

TTACTATAAT AACCGATGGT GTGGTTAATG TTGGTAAGAG AAACTTCTGA AACCAAGCTT 2220

CAAAAAAGTC GCTCGTCATC GTCTCTTCGT AAGTCATTGG AGCGATTAAT TCACCATTTG 2280

TTAGACCTGC AACCAAAGAA ATCCTCTGAT ATCTTCTTCC AGATACTTTG CCTCTTATTA 2340

ACTGACCTTT TAATGAGCGA CCATATTCTC GATAAAAATA AGTATCGAAT CCTGTTTCGT 2400

CAATCTAAAC AGGTGCTAGG TGCTTTAAAC TATTAAAATT CTTAAGAAAT AAGGCTACTT 2460

TTTCTGGGTC TTGTTCATAG TAGGTGTGGT TCTTTTTTTC GAGTGTAGCC CATAGCTTTG 2520

AGCGCATAGT GGATGGTAGT TGGATGACAG CCAAAkTCAG AAGCTATTTC AGTCAAATAA 2580

GCrTCTGGAT TGTCAGTAAG ATAGTTTTTA AGTCTATCTC TATCAACTTT TCTTGGTTTT 2640

GTTCCTTTTA CTTGGTGGTT TAGCTCTCCT GTTTTCTCTT TTAGCTTTAA CCAGCCATAA 2700

ATGGTATTAC GTGAGATTTG GAAAACGTGT GATGCTTCTG TTATACTACC TATTCGCTCA 2760

CAATAAGAGA GAACTTTTTT ACGAAAATCT ATTGAATATG CCATAAGAAG ATTATACCAC 2820

ATTGTGTACT ATTTTTGGTT CATTTCACTA TAACACAAAA TAGATTATTA TTACATAACA 2880

AAAAAGAGGT CTAAACCTCT TAACTCAATT ACTCCGCCAG TAGGACTCGA ACCTACGACA 2940

TCATGATTAA CAGTCATGCG CTACTACCAA CTGAGCTATG GCGGATTAAA GCTAAGCGAC 3000

TTCCCTATCT CACAGGGGGC AACCCCCAAC TACTTCCGGC GTTCTAGGGC TTAACTTCTG 3060

TGTTCGGCAT GGGTACAGGT GTATCTCCTA GGCTATCGTC ACTTAACTCT GAGTAATACC 3120

TACTCAAAAT TGAATATCTA TTCAATTTAA GAAAACCGTT CGCTTTCATA TTCTCAGTTA 3180

CTTTGGATAA GTCCTCGAGC TATTAGTATT AGTCCGCTAC ATGTGTCGCC ACACTTCCAC 3240
TTCTAACCTA TCTACCTGAT CATCTCTCAG GGCTCTTACT GATATATAAT CATGGGAAAT 3300
CTCATCTTGA GGTGGkTtCA CACTTAGATG CTTTCAGCGT TTATCCCTTC CCTACATAGC 3360 
TACCCAGCGA TGCCTTTGGC AAGACAACTG GTACACCAGC GGTAAGTCCA CTCTGGTCCT 3420 
CTCGTACTAG GAGCAGATCC TCTCAAATTT CCTACGCCCG CGACGGATAG GGACCGAACT 3480 
GTCTCACGAC GTTCTGAACC CAGCTCGCGT 3510
(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20986 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 
CGGAGAAAAA CATGGCTAAG TCAAACTTTG AAAAAGTAGA ATCAGTTGTT GGCTGGGTTC 60
GTGATAAGAA AATCACAGGC TACCGTATCT CTAAAGAAAC GAATGCGCGT GAAATGTCTA 120
TCATTGCTCT GGCGCAGGGT CGTGCAAAAG TAAAAAATAT TTCATTTGAA ACAGCCCTAG 180
GCCTAATTGA TTTCTATGAA AAAAATTATG AAAAATTTGA AGATTAATCT TTGGATAACG 240
GCGGATTCTT GACCTTCAAG TAGTAGAGAT AGAGAATCTG CCTTTTCATT TTGAGGACAG 300
CAAAAAGACT GCACGGTTGA TGCAGCCTTT TCTTTTTATT TGAGATAGCG TTGAAGGAAC 360
TCTTTTGTTC GGTCTTCTTT AGGATTGGTG AAGAGGTCTT CTGGTTTACC TTCTTCAGCG 420
ATCACGCCCT TATCCATAAA GATAACACGG TGAGAGACAT CACGGGCAAA TTCCATTTCA 480
TGGGTTACGA CAATCATGGT CAAGCCTTCC TGAGCCAGGT CCTGCATGAT TTTGAGGACT 540
TCTCCAACCA TTTCTGGATC GAGAGCTGAT GTTGGTTCAT CAAAGAGAAT AGCGTCCGGA 600
TTCATGGAGA GGGCACGAGC GATGGCCACA CGTTGTTTTT GACCACCTGA GAGTTGTTTT 660
GGTTTGGCTT GCCAGTAGCG TTCTCCCATG CCGACCTTTT CCAGGTTTTC TTTGGCAATC 720
TTTTCAGCTT CTGTGCGTTC GCGTTTTAGG ACAGTTGTCT GAGCGACGAT TGTGTTTTCA 780
AGAACATTGA GATTTTCAAA GAGGTTAAAG GATTGGAAAA CCATCCCCAA CTTTTCACGG 840
TATTGCGTGA GGTCATAGCC TTTTTCGAGG ACGTTTTGTC CATGATAAAG GATTTGTCCA 900
TCAGTTGGTG TTTCAAGTAG GTTAATGGAG CGTAGGAAGG TCGATTTTCC GCTTCCAGAG 960
CTTCCGATGA TAGAGATGAC CTCTCCCTTG TGGACAGTGA GTGAAATGTC TTTTAGCACT 1020
TCGTTTTGTC CATAGGATTT TTTGAGGTGT TTAATTTCAA GGATTGCTTG TGTCATTATT 1080
TCAAATCCTC CGTTTGCATT TGGTTAGCAC CTGTAGTGTA GGTATCCATG TCCATTCTGC 1140
GCTCGATAAA GCGTAGGATA CGTGTTACGG TGAAGGTGAG GACAAAGTAA ATCACGGCGA 1200

TGATTGTAAA TGTCTGGAAG TATTGATAGG TTTGTGTTGC CACGGTATTT CCTGAGAAAT 1260 

AAAGTTCGAC AACAGAGATA ACGTTCAATA CAGATGTATC TTTGATATTG ATGACAAATT 1320 

CATTACCAGT TGCAGGTAGG ATGTTACGGA CTACCTGAGG TAGGACAATC TTACGCATGG 1380 

TCTGGTTATG GGTCATACCA AGAGCAGTCG CAGCTTCAAA TTGTCCCTTG TCAACTGCTA 1440 

GGATACCACC ACGGACGATT TCAGTCATGT AGGCACCGGT ATTGATTGAA ACGATGAAGA 1500 

TAGCAGCCAG TGTACGGTCA AGGTTGATCC CGAAAGCTTG GGCAGTTCCA TAGTAGATAA 1560 

CCATCGATTG AACAATCATT GGCGTACCAC GGAAAATTTC AATGTAGACA TTGAGAACCC 1620 

AGCCGACTAG TTTTTGTAGG CCGTAAATGA CTTTGTTTTC AGAGAGAGGA GCAGTAGGGA 1680 

AGACACCAAT GGCAAGTCCA ATAATGAGAC CTATGATGGT TCCGACGATA GAGATTAAAA 1740 

GAGTGATACC AGCACCAGGC AAGAGTTGTT GCCAGTTTTC AGAAAGAATT TTAGCAACTT 1800 

GGCTAAAGAA ACTACTGCTA GTCTCTTCAG TTGTTGTAGC TTCGGCAGGT TGTTCCTTGA 1860

TCATACGATC CATCAAGGCA ACTTGGTCAT CTTTTGAAAT GGTTTCAATG CTGGCATTGA 1920 

TTTGGCTAAT ACGATTGTCA TTTTTACGAA GCCCGATAGC GATAGCTGTA TCTTCTTCCC 1980 

CAGTTTTGAA ACCAGGTTCT ACTTGAATCA TGTTGAACTT AGAGTTCGCA GCTTCAGCAG 2040 

TCAGTGCTTC TGGACGTTGA GAAACATAAG CATCAATGAC ACCAGCCTCA AGAGCTTGTC 2100 

GCATTTGAGC GAAGTCTCCG ATGGCTGTTT CTTTTTTAGC ACCTGGGATT TGTGCAATCA 2160 

AGTTATAAAG GTAGACCCCT TGTTGAGAAG TGATTTTTGC ACCGTTAAAG TCATCCAAAG 2220 

ATTTAGCACT TGCGTAGGCA GAATCTTTTT TGACAAGCAA AACTGGTTCG CTAGTATAGT 2280 

AACTGCTCGA AAAGGCAATT TCTTGTTTGC GTTCTGCAGT TGGACTCATA CCTGCGATAA 2340 

TCATGTCAAT CTTACCAGAA GTAAGGGCAG GGACTAGACC TTCCCACTTG GTTTTAACAA 2400 

CCAAAGGTTC TTTACCTAAG TCCTTAGCGA TTTTCTTGGC GATTTGAACA TCGTATCCGT 2460 

TGGCATACTG ATTGGTCCCA TCGATTTTGA CAGCTCCGTT GCTATCATCA TCCTGGGTCC 2520 

AGTTAAAGGG AGCATATGCT GCTTCCATAC CGATGCGTAA ATATTCATCG GCTTGAGCAA 2580 

CATTGACAAG TCCTAGCATC AGCAAGAGAC TTGTGAAAAT AGATAAGTAy ATGTGGCTCA 2640 

TGATTTCTCC TATTCTGATC TATTAAAAAA TAACTGTCTC CTATTTTATC GAAAAATGCG 2700 

TAATTTTTCA ACATAAGTAA GTCTTTACTT ACGAAAAAAT GCTATAATGA TAAGAAAGAT 2760 

AAAAAGGGGG CTTAGTTGAT GAAAAAAACT TTTTTCTTAC TGGTGTTAGG CTTGTTTTGC 2820 

CTTCTTCCAC TCTCTGTTTT TGCCATTGAT TTCAAGATAA ACTCTTATCA AGGGGATTTG 2880 
TATATTCATG CAGACAATAC GGCAGAGTTT AGACAGAAGA TAGTTTACCA GTTTGAGGAG 2940

GACTTTAAGG GCCAAATCGT GGGACTTGGA CGTGCTGGTA AGATGCCTAG CGGGTTTGAC 3000

ATTGACCCTC ATCCAAAGAT TCAGGCCGCG AAAAACGGTG CAGAACTAGC AGATGTGACT 3060

AGCGAAGTAA CAGAAGAAGC GGATGGTTAT ACTGTGAGAG TCTATAATCC AGGTCAGGAG 3120

GGCGACATAG TTGAAGTTGA CCTCGTCTGG AACTTAAAAA ATTTACTTTT CCTTTATGAT 3180

GATATCGCTG AATTAAATTG GCAACCTCTG ACAGATAGTT CAGAGTCTAT TGAAAAGTTT 3240

GAATTTCATG TAAGGGGAGA CAAGGGGGCT GAAAAACTCT TTTTCCATAC AGGGAAACTT 3300

TTTAGAGAGG GAACGATTGA AAAGAGTAAC CTTGATTATA CTATCCGTTT AGACAATCTT 3360

CCGGCTAAGC GTGGAGTTGA GTTGCATGCC TATTGGCCTC GGACCGATTT TGCTAGCGCT 3420

AGGGATCAGG GATTGAAAGG GAATCGTTTA GAAGAGTTTA ATAAGATAGA AGACTCGATT 3480

GTTAGAGAAA AAGATCAGAG TAAACAACTC GTTACTTGGG TCCTCCCTTC GATCCTTTCC 3540

ATCTCCTTGT TATTGAGTGT CTGCTTCTAT TTTATTTATA GAAGAAAGAC CACTCCTTCA 3600

GTCAAATATG CCAAAAATCA TCGTCTCTAT GAACCACCAA TGGAATTAGA GCCTATGGTT 3660

TTATCAGAAG CAGTCTACTC GACCTCCTTG GAGGAAGTGA GTCCCTTGGT CAAGGGAGCT 3720

GGAAAATTCA CCTTTGATCA ACTTATTCAA GCTACCTTGC TAGATGTGAT AGACCGTGGG 3780

AATGTCTCTA TCATTTCAGA AGGAGATGCA GTTGGTTTGA GGCTAGTAAA AGAAGATGGT 3840

TTGTCAAGCT TTGAGAAAGA CTGCCTAAAT CTAGCTTTTT CAGGTAAAAA AGAAGAAACT 3900

CTTTCCAATT TGTTTGCGGA TTACAAGGTA TCTGATAGTC TTTATCGTAG AGCCAAAGTT 3960

TCTGATGAAA AACGGATTCA AGCAAGAGGG CTTCAACTCA AATCTTCTTT TGAAGAGGTA 4020

TTGAACCAGA TGCAAGAAGG AGTGAGAAAA CGAGTTTCCT TCTGGGGGCT CCCAGATTAT 4080

TATCGTCCTT TAACTGGTGG GGAAAAGGCC TTGCAAGTGG GTATGGGTGC CTTGACTATC 4140

CTGCCCCTAT TTATCGGATT TGGTTTGTTC TTGTACAGTT TAGACGTTCA TGGCTATCTT 4200

TACCTCCCTT TGCCAATACT TGGTTTTCTA GGGTTAGTTT TGTCTGTTTT CTATTATTGG 4260

AAGCTTCGAC TAGATAATCG TGATGGTGTT CTAAATGAAG CGGGAGCTGA GGTCTACTAT 4320

CTCTGGACCA GTTTTGAAAA TATGTTGCGT GAGATTGCAC GATTGGATCA GGCTGAAGTG 4380

GAAAGTATTG TGGTCTGGAA TCGCCTCTTG GTCTATGCGA CCTTATTTGG CTATGCGGAC 4440

AAGGTTAGTC ATTTGATGAA GGTTCATCAG ATTCAAGTGG AAAATCCAGA TATCAATCTC 4500

TATGTAGCTT ATGGCTGGCA CAGTACGTTT TATCATTCAA CAGCACAAAT GAGCCATTAT 4560

GCTAGTGTCG CAAATACAGC AAGCACCTAC TCTGTATCTT CTGGAAGTGG AAGTTCTGGT 4620

GGTGGCTTCT CTGGAGGCGG AGGTGGCGGC AGTATCGGTG CCTTTTAAAG AGAGCTACCA 4680

TAGACTGAAA AAGTATGATA TAATGGAAGA TAGAAAAAAG ACAAACTATA AGAAAAGTCA 4740

ATAGTTTTAT CTAAACTATT TCTTATTTCA ATTTGATGAT TTGGCGATGA TTTTAGAGCA 4800 

CGGCAAAAAG CCCTTGAAAA AGTCCATTTT TTCAAAGGTA ATCCTGTGTT AATTTCAGAA 4860 

ATTACATCAC TTTTTGTTCG TCAAATGGCA GCTCTTTTTT AGGATATAAA ACAGGGTTCG 4920 

GATAAGTTTT TTTGCAAGGT GGATGATGGC TACATTGTAA TGTTTTCCTT ATTCTAACTT 4980 

AGTCTTAAGA TAGGCCTTAG AAGCAGGTGA AAAGCGAGGG CATGCTTTGG CAGCTTGTAT 5040 

GAGTGCCCAC CGCAGATGAG GGGAACCCCG TTTGACCATT CTTCCAGCTA AATCAATCTG 5100 

ACCTGACTGA TAAATAGAAG AATCCAGTCC AGCGAAAGCT TGTAATTGAG CAGGATTATC 5160 

AAAGGCATGA ATATTTCGAA TCTCGGCTAA AATGACCGCC CTAAACGATC CCCAATCCCA 5220 

GTAACCGTCG TGATGACCGA GTTGAACTCA GCCATCGAGT CATTGATACA TGTTTCCGCC 5280 

TTGTCAATGA GCCTCTTGTA ATGCTTGATG ATTTCGAATT CACGAGCAGG AGATGTTGTT 5340 

CCGATAGAAC GAGGTGCGAC TGAGAGGATA TCCTGAATTT TAGAAGCGGT CAATCGCTTA 5400 

ATTTCTATCA GCTTATCAAA TCCTGCCTCA ATCCTTTTCT GAGGATTAGG GTAGCGTGTC 5460 

AAGAGTTGGT AGGTATATTC TGAATGCTTT CCAACGATTT TATCCAACTC AGGAAAGATG 5520 

ATATCAAGAC AACGAGTGTA TTGTACTTTC CAATCAGACT GTTTTTGTTG AGACGATGAA 5580 

TATGTCTAGC CAGTATTTTT AGGTCTACTT GCCGATTATC GTGTTGAAAT TGTTCACGAT 5640 

TGGGGTCAGA AAGAAGTTTA AGAGCGATGC CATGAGCGTC TTTCTTATCC GTTTTAGTCT 5700 

TGCGAAGTGA TAATGATTTG GCAAATTCCT TGATGAGCAA AGGATTGTAG GTGTAAACTT 5760 

TATATCCTTG TTCATGCAGG AAGTTCAGTA GATTAAAGGC ATAATGTCCA GTATCTTCAA 5820 

GAGCGATGAG ACAGTCTTGG TTGATCTGTC GAATAGACAG ATCTAAGAGT TCAAAACCAG 5880 

CTTTATTATT TGAAAAAGTG AGTGGTTTAA GAACAGTTTT TCCTGGAACA TTCAAGGCTG 5940 

TAACATCGTG TTTATTTTTA GCGATATCAA TGCCTACATA AAGCATGGGA GTACCTCCAG 6000 

ATATAGTATT TCAAGTCTAC TTGGTTATCG ACGAATTTTT TGCCTTGTTA CCTTAGACGA 6060 

GATCAAACGT CTATGCGTTA TCAAACTCAT TACCAATTGA AACAAAAGCT GTGGTTAGAG 6120 

CCTTTCGGAA ATCGTCAAGC GATTGGAGGA AATGAACTAA TCCATAGTGG CTTATTCCAA 6180 

GTATACCACT TGGGCTTTGG CAGTAGCTAA CTGCGCTAAA TATAATATAG GGAGTAATCT 6240 
ATGTATCTTA TTGAAATTTT AAAATCTATC TTCTTCGGAA TTGTTGAAGG AATTACGGAA 6300

TGGTTGCCGA TTTCCAGTAC AGGTCACTTG ATTTTAGCAG AGGAATTCAT CCAATACCAA 6360 

AATCAAAATG AAGCCTTTAT GTCCATGTTT AATGTCGTGA TTCAGGTTGG TGCTATTTTA 6420
GCAGTTATGG TGATTTATTT TAACAAGCTC AATCCTTTTA AACCGACCAA GGACAAACAG 6480
GAAGTTCGTA AGACTTGGAG ACTATGGTTG AAGGTCTTGA TTGCTACTTT ACCTTTACTT 6540 
GGTGTCTTTA AATTTGATGA TTGGTTTGAT ACCCACTTCC ATAACATGGT TTCAGTTGCT 6600 
CTCATGTTGA TTATCTACGG GGTTGCCTTC ATCTATTTGG AAAAGCGCAA TAAAGCGCGT 6660 
GCTATCGAGC CAAGTGTAAC AGAGTTGGAC AAGCTTCCTT ATACGACCGC TTTCTATATC 6720 
GGACTCTTCC AAGTTCTTGC TCTTTTACCA GGGACTAGCC GTTCAGGTGC AACGATTGTC 6780 
GGTGGTTTGT TAAATGGAAC CAGTCGTTCA GTTGTGACAG AATTTACCTT CTATCTTGGG 6840 

ATTCCTGTTA TGTTTGGAGC TAGTGCCTTA AAGATTTTCA AATTTGTGAA AGCCGGAGAA 6900 

CTCTTGAGCT TTGGGCAATT GTTTTTGCTC TTGGTCGCGA TGGGAGTAGC TTTTGCGGTC 6960 

AGCATGGTGG CTATTCGCTT CTTGACCAGC TATGTGAAAA AACACGACTT CACCCTTTTT 7020 

GGTAAATACC GTATCGTGCT TGGTAGTGTT TTGCTACTTT ACAGTTTTGT CCGTTTATTT 7080 

GTATAAGAAA AACCTTGAAG GGGCAACTCT TCAAGGTTTT ATACTCTTCG AAAATCTCTT 7140 

CAAACCGCGT CAGCTTTATC TGCAACCTCA AAACAGTGTT TTGAGCAGCn CTGCGGCTAG 7200 

CCTCCTAGTT TGCTCTTTGA TTTTCATTGA GCTTTAAAAT CCAGTCATGG TAATCCCCAA 7260 

TAGGCGGACA CCTCTTTCTT TCTTGCTTAA TTCTTCATAG AGTTGCAGGG CTATTTGGCT 7320 

TATCTGACTA GCATCTTGTG TTTTTTGAGC AAGACTTTTT CGTTTGGTAA GAGTTGAAAA 7380 

GTCCTCGTAG CGGATTTTCA AAATGACAAT TTTTCCAGCT TTTTCTTGTT GATGTAGATT 7440

GAGAGCGACT TTTTCTGATA GAAGAGTCAG CTCTTTTTTG ATATCTTCCT CAGCAAGGAG 7500 

AATCTTCCCG TAGGTTTTCT CCTTGCCGAT TGATTTACGG ATGCGATTGG ATTTGACTGG 7560 

AGAGTTGTGA ATGCCACGAG CCTTTCGATA CAGATCATAG CCTAGTCTAC CAAAACGGTC 7620 

TATTAGGGTT ACCTCAGGAA CTTCAAGTAA ATCAGCACCA GTAAAAACGC CCATTTGATG 7680

AAGACGTTCT ACTGTCTTTT TTCCTACTCC ATGAAATTTG GAAATATCCA TTTGTTTGAG 7740 

AAAATCCTCA GCCTGTTCAG GTAGAATCAC TGTCAAACCA TGTGGTTTTT GATAATCACT 7800 

CGCCATTTTA GCTAAGAATT TGTTGTAAGA AACGCCTGCG GAAGCAGTTA GATGGAGTTC 7860 

TTGCCAGATA TCTTTTTGAA TGAGGCGAGC AATTTTGACC GCTGACTTGA TACCGAGTTT 7920 

ATTTTCTGTC ACATCCAAAT AGGCTTCGTC AATGCTCATG GGTTCAATCA AATCTGTATA 7980 

GCGCTTAAAA ATAGCTCGAA TCTGGAGTCC CACAGACTTG TATTTCTCAT AATTCCCTGA 8040 

GATAAAGACA GCCTGGGGAC AACGTTCATA AGCTTCCTTG GAACTCATGG CAGAATGGAC 8100 

ACCAAAAGCT CTTGCCTCAT AACTACAGGT AGAAACGACT CCCCGTCCAC CTGTTTGCCG 8160 

AGGGTCGCTT CCAATAATGA CAGGTTTTCC TCTGAGTTTA GGATTATCCC TGATTTCCAC 8220 
TGCAGCAAAA AAGGCATCCA TGTCAATATG GATGATTTTT CTTGACAAAT CATTTAACAA 8280

AGGAAAAATC AACATGCCTA GCACCTTTTT ATACTCTTCG AAAATCTCTT CAAACCACGT 8340 

CAGCTCTATm TGCAACCTCA AAACAGTGTT TTGAGCAATC TGCGGCTAGC TTCCTAGTTT 8400

GCTTTTCGAT TTCCATTGAG TGTTACTGCT TATTyTCTTT TATTATACCC TTTTTTCTGA 8460 

AAAAAAGAAA AAAGGACTTT ATTTTTTCAA AAATATAATA CAGTTTGAAA TAAAATATAG 8520 

ACTGTTTTAG AAAAGAAAGT GTAAAAATAG GGAATTTTCA CTTGTTGAAA TCGGTTACTA 8580 

TATGGTATAC TTGTCTTATG AATGTAACAG ATGACTGTTA CTAGAAAAAA GAGGACATTA 8640 

ATATGGTTGT TAAGACAGTT GTTGAAGCAC AAGATATTTT TGACAAAGCT TGGGAAGGCT 8700 

TCAAAGGCGT AGATTGGAAA GAAAAAGCAA GTGTATCACG ATTTGTACAA GCTAACTACA 8760 

CACCTTATGA TGGAGACGAA AGCTTCCTTG CAGGACCAAC AGAGCGTTCA CTTCACATCA 8820 

AGAAAATTGT AGAAGAAACT AAAGCACACT ACGAAGAAAC TCGTTTCCCA ATGGACACTC 8880 

GTCCAACATC TATCGCTGAT ATCCCTGCTG GATTTATCGA CAAAGAAAAT GAAGTTATCT 8940 

TCGGTATCCA AAACGATGAA CTCTTCAAAT TGAACTTCAT GCCAAAAGGT GGTATCCGTA 9000 

TGGCTGAAAC TACTTTGAAA GAAAATGGAT ACGAACCAGA CCCAGCTGTT CACGAAATCT 9060 

TCACTAAATA TGTAACAACA GTTAACGACG GTATTTTCCG TGCCTACACT TCAAATATTC 9120 

GTCGCGCTCG TCACGCACAC ACTGTAACTG GTCTTCCAGA TGCATACTCA CGCGGACGTA 9180 

TCATCGGTGT TTACGCACGT CTTGCTCTTT ACGGTGCAGA CTACTTGATG CAAGAAAAAG 9240 

TAAATGACTG GAATGCAATC AAAGAAATCG ATGAAGAAAC AATCCGTCTT CGTGAAGAAG 9300 

TAAACCTTCA ATACCAAGCA TTGCAACAAG TTGTTCGCCT GGGTGACCTT TACGGGGTTG 9360 

ATGTTCGCAA ACCAGCGATG AACGTGAAAG AAGCAATCCA ATGGGTTAAC ATTGCTTTCA 9420 

TGGCTGTCTG CCGTGTGATT AACGGTGCTG CTACATCTCT AGGTCGTGTA CCAATCGTAT 9480 

TGGACATCTT TGCAGAACGT GACCTTGCTC GTGGTACATT TACTGAATCA GAAATCCAAG 9540 

AATTCGTTGA TGATTTCGTT ATGAAACTTC GTACAGTTAA ATTTGCTCGT ACAAAAGCTT 9600 

ATGACCAATT GTACTCAGGT GACCCAACCT TTATCACAAC TTCTATGGCT GGTATGGGTA 9660 

ACGACGGTCG TCACCGTGTT ACTAAGATGG ACTACCGTTT CTTGAACACT GTTGACAACA 9720 

TCGGTAACTC ACCAGAACCA AACTTGACAG TTCTTTGGAC TGACAAATTG CCATACAACT 9780 

TCCGTCGCTA CTGTATGCAC ATGAGCCACA AACACTCTTC TATCCAATAC GAAGGTGTAA 9840 

CAACAATGGC TAAAGACGGA TATGGTGAAA TGAGCTGTAT CTCATGCTGT GTGTCTCCAC 9900 

TTGATCCAGA AAATGAAGAA CAACGCCACA ACATCCAGTA CTTCGGTGCT CGTGTAAACG 9960 
TTCTTAAAGC CCTTCTTACT GGTTTGAATG GTGGTTACGA CGATGTTCAC AAAGACTACA 10020

AAGTATTTGA TATCGAACCA ATCCGTGACG AAGTTCTTGA ATTTGAATCA GTTAAAGCGA 10080 

ACTTTGAAAA ATCTCTTGAC TGGTTGACTG ACACTTACGT AGATGCCTTG AACATCATCC 10140 

ACTACATGAC TGATAGGTAC AACTACGAAG CTGTTCAAAT GGCCTTCTTG CCAACTAAAC 10200 

AACGTGCCAA CATGGGATTC GGTATCTGTG GATTTGCTAA CACTGTTGAT ACATTGTCAG 10260 

CTATCAAATA CGCTACAGTT AAACCAATCC GTGACGAAGA TGGCTACATC TACGATTACG 10320 

AAACAATCGG TGACTACCCA CGCTGGGGTG AAGATGACCC ACGTTCAAAC GAATTGGCAG 10380

AATGGTTGAT CGAAGCTTAC ACAACTCGTC TACGTAGCCA CAAACTATAC AAAGACGCAG 10440 

AAGCTACAGT ATCACTTTTG ACAATCACAT CTAACGTTGC TTACTCTAAA CAAACTGGTA 10500 

ACTCACCAGT TCACAAAGGT GTATACCTCA ACGAAGATGG TTCTGTGAAC TTGTCTAAAC 10560 

TTGAATTCTT CTCACCAGGT GCTAACCCAT CTAACAAAGC TAAAGGTGGT TGGTTGCAAA 10620 

ACTTGAACTC ACTTTCTAGC CTTGACTTTA GTTATGCAGC TGACGGTATC TCATTGACTA 10680 

CACAAGTATC ACCTCGCGCT CTTGGTAAGA CTCGTGATGA ACAAGTTGAT AACTTGGTAA 10740 

CAATTCTTGA TGGTTACTTC GAAAACGGTG GACAACACGT TAACTTGAAC GTTATGGACT 10800 

TGAACGATGT TTACGAAAAA ATCATGTCAG GCGAAGACGT TATCGTACGT ATCTCTGGAT 10860 

ACTGTGTAAA CACTAAATAC CTCACTCCAG AACAAAAAAC TGAATTGACA CAACGTGTCT 10920 

TCCACGAAGT TCTTTCAATG GATGACGCCT TGGATGCATT GAGCTAATCA AGTTCTTGAA 10980 

TAATAAAAAG GAACCCTCGG TCAAACGACT GAGGGTTTTG TGCTTGGGAT AGTATGAGCA 11040 

ATTCCTTCGG CGCAATATGC AATGTTTTTG GGCTCTTTGT CAACTGTAGT GGGTTGAAAA 11100 

AAAGCTAAGC TTGAGAAAGG ACAAATTTCG TCCTTTCTTT TTTGATGTTC AGGGCGATAA 11160 

AAATCCGTTT TTTGAAGTTT TCAAAGTTCC GAAAACCAAA GGCATTGCGC TTGATGTCTT 11220 

TGATGAGTTT GTTAGTGGCC TCAAGTTTAG CGTTAGAATA AGGCAATTCA ATGGCGTTAG 11280 

TGATGTAGTT TTTATAGCAA ATAAATGTGC TCAAAGTGGT TTTAAAGGTG CGGTTGAGAT 11340

GAGGTAACGT GTCTTGAATT AAGCCCCAAA ACTGGTCAGT ATTCTTCTCT TGTAGATGAA 11400 

ATAGGAGTAG TTGATACAGG TCATAGTAAT CTTTAAGTTC AGGTACTAGA GTAAAGATTT 11460 

TCTTCAGACA CTCCCTAGGA GTTAAGGTCT CTCTGAAAGT TCTAGCATAG AAAGGCTTAA 11520 

GAGAGAGTTT CCGACTATCT TTTAGGATAA ATTTCCAGTA ATATTTAAGA GCTCTGTATT 11580 

CCAGAGATTT ATCATCAAAT TGCTTCATGA TGTTGATTCT AGTCTGATTA AGAGCCCTGC 11640 

TCATGTGTTG GACAATGTGG AAACGATCGA GAACAATTTT AGCATTGGGA AATAATTTCT 11700 

TAATGAGAGG GATATAACTT CCAGACATAT CAACAGTGAC GACTTTAACT TTTTTTCTAG 11760
CTTCTTTCGA GTACTTGAAG AAATGATTTC GGATGGTTGT TTGACGTCTG TTATCAAGAA 11820

TGGTCATGAT TTTCTTAGTG TTGAAATCCT GAGCAATGAA AGCCAATTTC CCCTTCTGGT 11880

AGGAGAATTC ATCCCAGGAG AGGATTTCAG GCAAAGTGGT GTAATCCTCT TGGAAATGAA 11940

ATTGCTTGAG CTTACGATAG ACGGTAGAGG TAGAGGTAGA GGTAGAGATG GCTAATTTAG 12000

AAGCGATATG TGTAAGAGCC TCTGTGTTGA GTAGGAGTTG GGCAATTTTC TGTCTCACCA 12060

TTTCCGAGAT TTGGCAATTT TTCTGAACGA GAGTTGTTTC AGCTACAGTG ACTTTCCGAC 12120

AGGACTTGCA TTGAAATCGT CTCTTTTTCA AATGAATGAG GCTAGGGAAA CCACCAATCT 12180

CGATAAAAGG GATTTTAGAA GGCTTTTGGA AGTCGTATTT GATTTGTTTT CCTTTACAGT 12240

GTTTACATTT AGGTGGGTGA TAATCAAGTG TAGCGAAGAC TTCGATATGG GTATCGTGCT 12300

GAATGGCTTT ATTTAAGGTG ATGTTTTTGT CTTTTATTCC GATGAGTAAT GTGGTATGAT 12360

TGATGTGTTC CATAAGATAC TTTCTAATGA GTTGTTTAGG CGCTTTTCAT TATAAGTCTT 12420

ATGGGACTTT TTTGATACTC AAAAAGCCCT ATAATCTCCA CAGTGGGATT TACCCACTAC 12480

AGAAATTATA GAGCCAGAAA AAACACTTTT GTTCACTAGC AGAAACTAGA GAGCAGAAGT 12540

GTTTTTCTGT TCAGATTTAC CCAAAACTGG GAAATATGGG GATAAGAATA GAGATGGCTT 12600

AGGAAGCCCC TTTTTGTGTG TAGACAGTAC GATGAACTTA TAACAAATAG TGAGCCTTTT 12660

TAGCAATCAT TGCGACCCGT TTGTCAAAAG CCTCTTTTCG GATATCTACA ATTGTCTGAT 12720

AGATGAGACG CTGTTGGCTA ACATGCAAAT CTAAGGCAAT CGTCAAAAAG TGATGTTTCC 12780

CTTTGGGATA CTGCTTTTTA ACGTAAGGCA GGTATTCTTT CGTTGTAATA ATAATCAATG 12840

GCTCTGTCAA ATGCTCCTCT GAAGGAGGAG GACTAATTAG AATATTGTAT CCTGTAACAG 12900

AGGCAACTTT GTCAGTAAAA TTCCGTAAAA TAATGGACTT TATTAAGTTT ACATCTGCTT 12960

GATTATTTAA AATGATAAAA ATCGGGATAG CAGGTAGTGA GGAAAAGATG GTTTCTGTCA 13020

AGTAGAGTGA GAAAAGGTAC AGCCGATGCT GGTCGATAAC TCCTTCAATC TTCTGCTCAG 13080

TCATCCACTC TTGAACAATT GCTTTCGAAA TATGATACAG TGGCTTGTCG CTTTCAATCC 13140

CATAATGTTC GTAATAATTA TAATAGGGAA CTAGATTTTG TAAACCAAAC AAAAACGTTC 13200

TTGTTAAGAA AGTCAGTGCT GTTAAAAAAG AAAGAGAATT CGAAATGTCA TTTCCTAAGA 13260

TATTCTTGAA CTTGGATAGT AGATGCTTTC CTCTTGTATG CTGAAGAATC AGTTGAATAG 13320

TATGAGTCTT lWltTO TTCCATTTGT CCTTGGAAAA CGAAGAATTA GCAGAACAAT 13380

AAACCAAAAA GATATAATCC AGTTCTTCCT GAGTAAAAGT CATGTTGGCA TGTGGCTGTA 13440

AGTAAGTTTG GCAATGTTCC ATCAAAATCG GATACATAAA GAGGTTTTTT AATTTTTGAA 13500

ACTCTTTGGA CTCAGGGAAC TCAAGTGGAA ATTCCCGACG TTTCCAAGTG AGTGCCACTA 13560

GTATGCTAAA ATGAACATAC TCGTCAGGTG TGATTTCTAA CAGTTCATGA CTGAGTTGAG 13620 

AATTAGACTG CACAATCATA TGTGTGACCC AATCCATACT TCCATCATTC AAATCATAAA 13680 

TCTCAATACC AAAATGAAAC TGGAGGAGTG CAATTAAAAA ACGAATGCGA TATTCAGGAC 13740 

CAACTACTTG ATTTTTCACA AGGTCCAAAC CTACTGAACG TAGTAACAAG CCACACTTTT 13800 

GTCGTACGCG GTAGCCTGTT GCGATGGAAA TATACTCTTT TTGTGTAAAT TCGTTAAAGC 13860 

TTTGATTACC TTGTAGTAGA AAGAAGCGGA GTATTTTTAA AATAGTTGAT TGGTTATAAA 13920 

GCTGATGGAA GTAATAATTC GTTTGATGAG AATGGTGTTC GATTAATTGA ACTTGTTGCG 13980 

TATCTAAATT AAATGTCAAC TCTTCCTCGA ATGTTTCTTG TAATTCCTGC AAAATGCTTA 14040 

GGAGACTTTT AGATTGTAAT GAAGTTAAAG TAGACAGTTC ATCTAGTTCA ATAGACCGAA 14100 

TATCCAATAA TATATTTAAA ATGGTAATTT TATCTGTAAT TCTTTTTTCA ATGTATTTGT 14160 

TTAGCATAGT TACCGAATCT TAGTTGCATA TAGATAATTT TAATTATTAT AATACAAAAG 14220 

AAACTAATTG TCTTGTCAAA AAGGTTGTGG AATTTCCGAC TTTATTGATA AAACAGCATG 14280 

TAATAAAAGG CATTTTAAAG ATAGTAATGA GTATTGGTGG AGTTTTATGG CTTATTTTTT 14340 

TTATTAGAAA ATATTTTTTT ATCAAATATT GTCGTTCTAT AAAAAAATAT GTGATAAAAA 14400 

TATCTATTGT GATGGAAGTT GTTTTAATTT ATACTAGGAT AGTTAATAGT AATACTATAC 14460 

TATACTATAT TGTATACAAG TGTGTCATTG CCAGGTTGAG AAGATAGCTA TAACGCACTT 14520 

TTATACGCTT TTGCTACGTT TGTTAGTGAA CGGATTAACT CAGTGAGATA AATTTTATCA 14580 

GAACATAAGT AATCCGTTTC TTCGTGTATA CAGATTGAAA GTACCTATGA ATCATAGAAG 14640 

GATTAACTTG TTCTATGAAT AATGCTTAAC AGGGAGACAC ACATGAAAAA AGTAAGAAAG 14700 

ATATTTCAGA AGGCAGTTGC AGGACTGTGC TGTATATCTC AGTTGACAGC TTTTTCTTCG 14760 

ATAGTTGCTT TAGCAGAAAC GCCTGAAACC AGTCCAGCGA TAGGAAAAGT AGTGATTAAG 14820 

GAGACAGGCG AAGGAGGAGC GCTTCTAGGA GATGCCGTCT TTGAGTTGAA AAACAATACG 14880 

GATGGCACAA CTGTTTCGCA AAGGACAGAG GCGCAAACAG GAGAAGCGAT ATTTTCAAAC 14940 

ATAAAACCTG GGACATACAC CTTGACAGAA GCCCAACCTC CAGTTGGTTA TAAACCCTCT 15000 

ACTAAACAAT GGACTGTTGA AGTTGAGAAG AATGGTCGGA CGACTGTCCA AGGTGAACAG 15060 

GTAGAAAATC GAGAAGAGGC TCTATCTGAC CAGTATCCAC AAACAGGGAC TTATCCAGAT 15120 

GTTCAAACAC CTTATCAGAT TATTAAGGTA GATGGTTCGG AAAAAAACGG ACAGCACAAG 15180 

GCGTTGAATC CGAATCCATA TGAACGTGTG ATTCCAGAAG GTACACTTTC AAAGAGAATT 15240 

TATCAAGTGA ATAATTTGGA TGATAACCAA TATGGAATCG AATTGACGGT TAGTGGGAAA 15300 
ACAGTGTATG AACAAAAAGA TAAGTCTGTG CCGCTGGATG TCGTTATCTT GCTCGATAAC 15360

TCAAATAGTA TGAGTAACAT TCGAAACAAG AATGCTCGAC GTGCGGAAAG AGCTGGTGAG 15420

GCGACACGTT CTCTTATTGA TAAAATTACA TCTGATTCAG AAAATAGGGT AGCGCTTGTG 15480

ACTTATGCTT CCACTATCTT TGATGGGACC GAGTTTACAG TAGAAAAAGG GGTAGCAGAT 15540

AAAAACGGAA AGCGATTGAA TGATTCTCTT TTTTGGAATT ATGATCAGAC GAGTTTTACA 15600

ACCAATACCA AAGATTATAG TTATTTAAAG CTGACTAATG ATAAGAATGA CATTGTAGAA 15660

TTAAAAAATA AGGTACCTAC CGAGGCAGAA GACCATGATG GAAATAGATT GATGTACCAA 15720

TTCGGTGCCA CTTTTACTCA GAAAGCTTTG ATGAAGGCAG ATGAGATTTT GACACAACAA 15780

GCGAGACAAA ATAGTCAAAA AGTCATTTTC CATATTACGG ATGGTGTCCC AACTATGTCG 15840

TATCCGATTA ATTTTAATCA TGCTACGTTT GCTCCATCAT ATCAAAATCA ACTAAATGCA 15900

TTTTTTAGTA AATCTCCTAA TAAAGATGGA ATACTATTAA GTGATTTTAT TACGCAAGCA 15960

ACTAGTGGAG AACATACAAT TGTACGCGGA GATGGGCAAA GTTACCAGAT GTTTACAGAT 16020

AAGACAGTTT ATGAAAAAGG TGCTCCTGCA GCTTTCCCAG TTAAACCTGA AAAATATTCT 16080

GAAATGAAGG CGGCTGGTTA TGCAGTTATA GGCGATCCAA TTAATGGTGG ATATATTTGG 16140

CTTAATTGGA GAGAGAGTAT TCTGGCTTAT CCGTTTAATT CTAATACTGC TAAAATTACC 16200

AATCATGGTG ACCCTACAAG ATGGTACTAT AACGGGAATA TTGCTCCTGA TGGGTATGAT 16260

GTCTTTACGG TAGGTATTGG TATTAACGGA GATCCTGGTA CGGATGAAGC AACGGCTACT 16320

AGTTTTATGC AAAGTATTTG TAGTAAACCT GAAAACTATA CCAATGTTAC TGACACGACA 16380

AAAATATTGG AACAGTTGAA TCGTTATTTC CACACCATCG TAACTGAAAA GAAATCAATT 16440

GAGAATGGTA CGATTACAGA TCCGATGGGT GAGTTAATTG ATTTGCAATT GGGCACAGAT 16500

GGAAGATTTG ATCCAGCAGA TTACACTTTA ACTGCAAACG ATGGTAGTCG CTTGGAGAAT 16560

GGACAAGCTG TAGGTGGTCC ACAAAATGAT GGTGGTTTGT TAAAAAATGC AAAAGTGCTC 16620

TATGATACGA CTGAGAAAAG GATTCGTGTA ACAGGTCTGT ACCTTGGAAC GGATGAAAAA 16680

GTTACGTTGA CCTACAATGT TCGTTTGAAT GATGAGTTTG TAAGCAATAA ATTTTATGAT 16740

ACCAATGGTC GAACAACCTT ACATCCTAAG GAAGTAGAAC AGAACACAGT GCGCGACTTC 16800

CCGATTCCTA AGATTCGTGA TGTGCGGAAG TATCCAGAAA TCACAATTTC AAAAGAGAAA 16860

AAACTTGGTG ACATTGAGTT TATTAAGGTC AATAAAAATG ATAAAAAACC ACTGAGAGGT 16920

GCGGTCTTTA GTCTTCAAAA ACAACATCCG GATTATCCAG ATATTTATGG AGCTATTGAT 16980

CAAAATGGCA CTTATCAAAA TGTGAGAACA GGTGAAGATG GTAAGTTGAC CTTTAAAAAT 17040

CTGTCAGATG GGAAATATCG ATTATTTGAA AATTCTGAAC CAGCTGGTTA TAAACCCGTT 17100

CAAAATAAGC CTATCGTTGC CTTCCAAATA GTAAATGGAG AAGTCAGAGA TGTGACTTCA 17160 

ATCGTTCCAC AAGATATACC AGCGGGTTAC GAGTTTACGA ATGATAAGCA CTATATTACC 17220

AATGAACCTA TTCCTCCAAA GAGAGAATAT CCTCGAACTG GTGGTATCGG AATGTTGCCA 17280 

TTCTATCTGA TAGGTTGCAT GATGATGGGA GGAGTTCTAT TATACACACG GAAACATCCG 17340 

TAAAGTGTAG AAATGATAAT ATCTATGTTC TGAACGATAC TTTTAAGAAG TAGCACTCAA 17400 

GAAGAGATTT AAGTTTACTT GGTGAAACCT GTTTTATTCG TAAGTAAACT ATCATTGAAA 17460 

GGGGAGATGT TTTCGAAAAC TTGCACAGAA AAAGGATTAT TATTGTCATG TGTAATTCAT 17520 

TACATTGCTC ACAGTTGATT TTAAGAGATA TGAATAAGGA GAAATCATGA AATCAATCAA 17580 

CAAATTTTTA ACAATGCTTG CTGCCTTATT ACTGACAGCG AGTAGCCTGT TTTCAGCTGC 17640 

AACAGTTTTT GCGGCTGGGA CGACAACAAC ATCTGTTACC GTTCATAAAC TATTGGCAAC 17700 

AGATGGGGAT ATGGATAAAA TTGCAAATGA GTTAGAAACA GGTAACTATG CTGGTAATAA 17760 

AGTGGGTGTT CTACCTGCAA ATGCAAAAGA AATTGCCGGT GTTATGTTCG TTTGGACAAA 17820 

TACTAATAAT GAAATTATTG ATGAAAATGG CCAAACTCTA GGAGTGAATA TTGATCCACA 17880 

AACATTTAAA CTCTCAGGGG CAATGCCGGC AACTGCAATG AAAAAATTAA CAGAAGCTGA 17940 

AGGAGCTAAA TTTAACACGG CAAATTTACC AGCTGCTAAG TATAAAATTT ATGAAATTCA 18000 

CAGTTTATCA ACTTATGTCG GTGAAGATGG AGCAACCTTA AGAGGTTCTA AAGCAGTTCC 18060 

AATTGAAATT GAATTACCAT TGAACGATGT TGTGGATGCG CATGTGTATC CAAAAAATAC 18120 

AGAAGCAAAG CCAAAAATTG ATAAAGATTT CAAAGGTAAA GCAAATCCAG ATACACCACG 18180 

TGTAGATAAA GATACACCTG TGAACCACCA AGTTGGAGAT GTTGTAGAGT ACGAAATTGT 18240 

TACAAAAATT CCAGCACTTG CTAATTATGC AACAGCAAAC TGGAGCGATA GAATGACTGA 18300 

AGGTTTGGCA TTCAACAAAG GTACAGTGAA AGTAACTGTT GATGATGTTG CACTTGAAGC 18360 

AGGTGATTAT GCTCTAACAG AAGTAGCAAC TGGTTTTGAT TTGAAATTAA CAGATGCTGG 18420 

TTTAGCTAAA GTGAATGACC AAAACGCTGA AAAAACTGTG AAAATCACTT ATTCGGCAAC 18480 

ATTGAATGAC AAAGCAATTG TAGAAGTACC AGAATCTAAT GATGTAACAT TTAACTATGG 18540 

TAATAATCCA GATCACGGGA ATACTCCAAA GCCGAATAAG CCAAATGAAA ACGGCGATTT 18600 

GACATTGACC AAGACATGGG TTGATGCTAC AGGTGCACCA ATTCCGGCTG GAGCTGAAGC 18660 

AACGTTCGAT TTGGTTAATG CTCAGACTGG TAAAGTTGTA CAAACTGTAA CTTTGACAAC 18720 

AGACAAAAAT ACAGTTACTG TTAACGGATT GGATAAAAAT ACAGAATATA AATTCGTTGA 18780 

ACGTAGTATA AAAGGGTATT CAGCAGATTA TCAAGAAATC ACTACAGCTG GAGAAATTGC 18840 

TGTCAAGAAC TGGAAAGACG AAAATCCAAA ACCACTTGAT CCAACAGAGC CAAAAGTTGT 18900

TACATATGGT AAAAAGTTTO TCAAAGTTAA TGATAAAGAT AATCGTTTAG CTGGGGCAGA 18960

ATTTGTAATT GCAAATGCTG ATAATGCTGG TCAATATTTA GCACGTAAAG CAGATAAAGT 19020

GAGTCAAGAA GAGAAGCAGT TGGTTGTTAC AACAAAGGAT GCTTTAGATA GAGCAGTTGC 19080

TGCTTATAAC GCTCTTACTG CACAACAACA AACTCAGCAA GAAAAAGAGA AAGTTGACAA 19140

AGCTCAAGCT GCTTATAATG CTGCTGTGAT TGCTGCCAAC AATGCATTTG AATGGGTGGC 19200

AGATAAGGAC AATGAAAATG TTGTGAAATT AGTTTCTGAT GCACAAGGTC GCTTTGAAAT 19260

TACAGGCCTT CTTGCAGGTA CATATTACTT AGAAGAAACA AAACAGCCTG CTGGTTATGC 19320

ATTACTAACT AGCCGTCAGA AATTTGAAGT CACTGCAACT TCTTATTCAG CGACTGGACA 19380

AGGCATTGAG TATACTGCTG GTTCAGGTAA AGATGACGCT ACAAAAGTAG TCAACAAAAA 19440

AATCACTATC CCACAAACGG GTGGTATTGG TACAATTATC TTTGCTGTAG CGGGGGCTGC 19500

GATTATGGGT ATTGCAGTGT ACGCATATGT TAAAAACAAC AAAGATGAGG ATCAACTTGC 19560

TTAAGTAAGA GAGAAAGGAG CCATTGATGA CAATGCAGAA AATGCAGAAA ATGATTAGTC 19620

GTATCTTCTT TGTTATGGCT CTGTGTTTTT CTCTTGTATG GGGTGCACAT GCAGTCCAAG 19680

CGCAAGAAGA TCACACGTTG GTCTTGCAAT TGGAGAACTA TCAGGAGGTG GTTAGTCAAT 19740

TGCCATCTCG TGATGGTCAT CGGTTGCAAG TATGGAAGTT GGATGATTCG TATTCCTATG 19800

ATGATCGGGT GCAAATTGTA AGAGACTTGC ATTCGTGGGA TGAGAATAAA CTTTCTTCTT 19860

TCAAAAAGAC TTCGTTTGAG ATGACCTTCC TTGAGAATCA GATTGAAGTA TCTCATATTC 19920

CAAATGGTCT TTACTATGTT CGCTCTATTA TCCAGACGGA TGCGGTTTCT TATCCAGCTG 19980

AATTTCTTTT TGAAATGACA GATCAAACGG TAGAGCCTTT GGTCATTGTA GCGAAAAAAA 20040

CAGATACAAT GACAACAAAG GTGAAGCTGA TAAAGGTGGA TCAAGACCAC AATCGCTTGG 20100

AGGGTGTCGG CTTTAAATTG GTATCAGTAG CAAGAGATGT TTCTGAAAAA GAGGTTCCCT 20160

TGATTGGAGA ATACCGTTAC AGTTCTTCTG GTCAAGTAGG GAGAACTCTC TATACTGATA 20220

AAAATGGAGA GATTTTTGTG ACAAATCTTC CTCTTGGGAA CTATCGTTTC AAGGAGGTGG 20280

AGCCACTGGC AGGCTATGCT GTTACGACGC TGGATACGGA TGTCCAGCTG GTAGATCATC 20340

AGCTGGTGAC GATTACGGTT GTCAATCAGA AATTACCACG TGGCAATGTT GACTTTATGA 20400

AGGTGGATGG TCGGACCAAT ACCTCTCTTC AAGGGGCAAT GTTCAAAGTC ATGAAAGAAG 20460

AAAGCGGACA CTATACTCCT GTTCTTCAAA ATGGTAAGGA AGTAGTTGTA ACATCAGGGA 20520

AAGATGGTCG TTTCCGAGTG GAAGGTCTAG AGTATGGGAC ATACTATTTA TGGGAGCTCC 20580

AAGCTCCAAC TGGTTATGTT CAATTAACAT CGCCTGTTTC CTTTACAATC GGGAAAGATA 20640

CTCGTAAGGA ACTGGTAACA GTGGTTAAAA ATAACAAGCG ACCACGGATT GATGTGCCAG 20700 

ATACAGGGGA AGAAACCCTT GTATATCTTG ATGCTTGTTG CCATTTTGTT GTTTGGTAGT 20760 

GGTTATTGTC TTACGAAAAA ACCAAATAAC TGATATTCAA TGTACATCAT TATGAATAGG 20820 

ATAGCAGGCT GAAGGGAAGA CCAGAGTACT CTGAGGTGAT GTTAATCAGG AATCATGGTG 20880

ATGTGGCATG AATCATCAAT AACGGATATG AGGCTGGGCA GATTGTGCCA GCCTCATTGT 20940 

GGGTTATTGT TTGTAAAACG ATAGGACTGG TCTGGTAATC ATTTTA 20986 
(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21040 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:
CCCAGCAAAA AGCCATCCGA AGATGACTTT TTTGCTATTT AATTTCTGTA TAAGTTACTT 60

CCAAGCCACG CTTAACAGCT GGACGATTGG CAATTTTTTC TGCCCATTTT ACTAGATTTT 120

GATAACTTGA GGCATCCAAG AATTTTGCAG AACCTTGGTA AAGATTTCCT TGAACTAACT 180

GTCCATACCA AGACCAGATA GCAATATCTG CAATCGTATA GTCATTGCCT GCAATATAAG 240

GTTTCTGAGC CAATTCCTTA TCCAATAAAT CCAACTGGCG TTTCACTTCC ATCGTAAAAC 300

GGTTAATAGG ATATTCCAAT TTTTCAGGAG CATAATTGAA GAAATGTCCA AATCCCCCAC 360

CTAGAAAAGG TGCTGCACCT GCTTGCCAGA ATAGCCAATT CAAAACTTCT ACCTTTTCCA 420

CAGGATTACT TGGTAAAAAG GCTCCAAATT TCTCAGCAAG GTAAAGAAGA ATATGAGCAG 480

ACTCAAAGAC TCTTACGTTT TCAGTACCTG ACTGGTCCAA TAAGGCTGGA ATCTTGGAAT 540

TTGGATTGAG CTTCACAAAG TCTGATCCGA ATTGATCCCC ATCCATGATA GCAATCTTAT 600

ACAAGTCGTA AGCCGCTTCC TTAAAACCAG CTTCTAGTAA TTCTTCCAAT AAGATAGTAA 660

CCTTCACACC ATTTGGTGTT CCCAGTGAAT AAAGCTGAAA AGCTTGTTCT CCTTTTGGCA 720

AGTTTTGTTC GAAACGGGCA CCTGCTGTTG GTCTGTTTAG CCCCGTAAAA GCTCCTTGAT 780

TACTAGCTTC ATCCTGCCAT ACGGTCGGTA ATTGATATGC TGACATCCGA AACCTCCCTT 840

AAATCGCATT CTTGTCAAAA CCGAGTTTGC GTTGAATAAA CTTAACGATT TCGACGATGA 900

TAATCATTGA GAAGCTTCCA GCCATAACAA TTCCCCATTG TGACAAGTCT AGTTTGGTTA 960

CGTGGAAGAT TCCTTCAAGC GGTTCTACAA CGATTGTTGC CATGAGAAGG ATAAAGGATA 1020

CCAAGATGGA CCAGTTAAAG GTCTTAGACT TGAATGGGCC AACTGTCAAG ATGGATTGGT 1080

AGACAGACTT GACATTGTAG GCATGGAAGA GCTGAATCAA ACCAAGGGTT GCAAAGGCCA 1140

TCGTTAGGGC ATCTGCATGA ATAGCATGAT TGTCACCCAC ATGAACTGGG TAAGCAATCG 1200

CAAGGCCATA AACACTCATA ACAAGAGCTG CTTGGAGTAC ACCTTGATAA ATGATAGAAC 1260

TCAAAACACC ACCTGAGAAG AAGCTTGCCT TGCGTCCACG TGGTTTATGA TTCATGACAC 1320

CAGGTTCCGC AGGTTCAACA CCAAGAGCGA TAGCTGGGAA GGTATCCGTT ACCAAGTTGA 1380

TCCACAAAAG ATGAACCGGC TGTAAGACAT CCCAACCAAA CAAGGTTGAT AGGAAGATGG 1440

TTAATACTTC AGCAGTATTA GCAGAAAGTA GGTACTGAAT AGTCTTTTGA ATGTTTGAGA 1500

AGACCTTACG TCCTTCTTCC ACTGCGACGA TAATAGTCGC AAAGTTATCA TCTGCAAGAA 1560

TCATATCAGA AGCCCCCTTA GAAACCTCTG TACCAGTGAT TCCCATACCG ATACCGATAT 1620

CGGCTGTTTT CAGAGCTGGC GCGTCATTGA CACCGTCACC TGTCATGGCA ACGACTTTAC 1680

CTTGTTTTTG CCAAGCCTTG ACGATACGAA CCTTGTGTTC TGGAGACACA CGGGCATAAA 1740

CAGAGTATTG ACCAACGACT TTTTCAAATT CTTCATCTGA CAGTTCATTG AGTTCAGCAC 1800

CAGTTAAAAC GTGACCTTCT GTATCGTTTG CGTCAATGAT TCCCAAACGT TTGGCAATGG 1860

CTTCCGCTGT GTCTTGGTGG TCACCTGTAA TCATAATTGG ACGGATTCCC GCTTCCTTAG 1920

CCACACGAAC AGCCTCAGCG GCTTCAGGAC GTTCAGGGTC AATCATCCCA ATCAAACCAG 1980

TAAAAATTAA ATCATTTTCA AGCTCTTCAG AAGTGAGATT TTCTGGAATA CTATCGATAA 2040

TCTTATAAGC ACCTGCAAGG ACACGCAAGG CTTGATGAGC CATTTCAGAA TTGTTTGTAC 2100

GAATGAGATT TGTAACCTTC TCATCAATCG GAGCAATATC CCCAGCCTTA TGACGAAGAA 2160

GACAACGTTT TAAGAGTTGG TCTGGCGCAC CCTTGACTGC TACAAGGAAA CGACCATCTG 2220

GCAATGGGTG AACTGTTGAC ATGAGCTTAC GGTCAGAGTC AAATGGCAAT TCAGCTACAC 2280

GAGGATATTT CTCTAAGAAA CCTTTGACAT CATAGCCCTT GTCCAAGGCA TATTGGATAA 2340

AGGGTGTTTC GGTTGGGTCA CCAATCAAGT TACCTTCCAC ATCGATTTTC GTATCATTGG 2400

CCAAGACAAC TGAACGAAGT AGTGGCATTT CAAGACCTAG TTCAATATCA TCAGCTGAGT 2460

CATGTAGAAC CGCATCGTAG AAGACTTTTT CGACTGTCAT CTTGTTCATA GTCAGCGTAC 2520

CAGTCTTATC AGAAGCGATG ATTTCAGTTG AACCAAGTGT TTCAACTGCT GGCAACTTAC 2580

GAACGATGGA ATGTCGTTTG GCCAAAACTT GAGTACCAAG AGAAAGAACG ATGGTAACGA 2640

TAGCAGGAAG TCCTTCTGGA ATGGCTGCAA CGGCAAGGGC AACAGAAGTC AACAACTCAG 2700

CAAGTGGATT TTTCCCTTGA ATGAAGACAC CCACTACAAA AGTAACAAGG GCAATGACCA 2760

AGATAGCATA GGTCAAGACC TTAGAAAGGT TGTTCAAATT TTGTTTGAGT GGTGTATCAG 2820

TCTCATCCGC ATCTTGAAGC ATACCAGCAA TATGACCAAC TTCAGTGTAC ATACCTGTAT 2880 

TGACAACAAC ACCCATCCCA CGACCATAGG TTACGTTTGA GTTTTGGAAG GCCATGTTGA 2940 

CACGGTCACC AATACCAGCA TCTGTCGCAA GCTCGACTGA CAAGTCTTTT TCGACTGGTA 3000 

CAGATTCACC TGTCAAGGCT GCTTCTTCAA TTTTAAGAGA GTTGGCTTCT ATCAAACGTA 3060 

GGTCCGCTGG TACCACGTCA CCTGCTTCAA GGGCAACGAT ATCGCCTGGT ACCAATTCTT 3120 

TAGAGTCAAT CTCTGCCATG TGTCCATCAC GAAGAACGCG GGCAACTGGA CTAGACATGG 3180

ATTTGAGGGC TTCAATAGCT TCTTCAGCTT TTCCTTCTTG GTAAACACCA AAGGCAGCGT 3240 

TGATGATAAC CACAGCTAGG ATGATAATGG CATCTGCGAT ATCTTCCCCA CCAGAAGTCA 3300 

CGACTGACAA GATTGctGCC GCAACTAGGA TGATAATCAT CAAATCCTTA AATTGCTCGA 3360 

TGAATTTGAC CAAGATTGAT CGTTTCTCGC CTTCTTCGAG TTCATTGTGC CCAAATTCGG 3420 

CAAGGCGCTT TTCCGCCTCA CTTGATGACA AACCTTGCTC GGTCGCATCC ACAGCCTGCA 3480 

AGACCTCTTC AGGGCTCTGA GTATAAAACG CTTGGCGTTT TTGTTCTTTT GACATGTGTC 3540 

TCCTCCTTGA CATTGTGTGC AAAACAGACT CTCTTTCTGT CATAGCTTTT CACGACAAAC 3600 

AAAAAGAAAC CTGTTAATCA TAACAAGTCT CGCTGTTTAA GATAGGGCCG GAAAGCATAC 3660 

TTTTCAGCAT AAAATTCGGA ATGACGACAC TATCACAGGT TTCTGCCAGC TACTCCCTTG 3720 

AGTAGTACCA TTATACCAAA TTTTGGGGAG TTTTCAAAGA GTAAAAACTG CCTTATTTGA 3780 

ATTTTTCCTT GAAAACCAGT ATAATGGTAG AATGCTATGT GACTAGAAAG GAAGTTGAAT 3840 

GAAGCAATCT ATCTCAAATC TCAAGTTAGC TGAGCGTGGA GCCATTATCA GTATTTCGAC 3900 

CTATTTGATC TTGTCTGCAG CCAAATTAGC AGCTGGTCAT CTCCtTCATT CATCCAGTTT 3960 

GGTGGCCGAT GGTTTTAATA ACGTATCGGA CATCATTGGA AATGTGGCCC TCTTAATCGG 4020 

GATTCGGATG GCGCGCCACC TGCAGACCGT GACCACCGTT TTGGTCATTG GAAGATTGAA 4080 

GATTTGGCAA GCTTGATCAC TTCTATCATC ATGTTCTATG TCGGTTTCGA TGTTCTAAGA 4140 

GATACCATTC AAAAGATTCT CAGTCGGGAA GAAACGGTCA TTGATCCTCT TGGTGCAACT 4200 

CTAGGAATCA TTTCTGCAGC GATTATGTTT GTGGTCTATC TCTACAATAC TCGCCTCAGT 4260 

AAGAAATCCA ACTCCAATGC GCTGAAGGCA GCTGCTAAGG ACAATCTTTC TGACGCTGTT 4320 

ACCTCACTTG GAACCGCCAT TGCCATCCTA GCTAGTAGTT TCAATTATCC GATTGTGGAT 4380 

AAACTGGTTG CTATCATCAT CACTTTCTTT ATCTTGAAGA CTGCCTATGA TATCTTCATC 4440 

GAGTCTTCCT TTAGTCTTTC AGATGGCTTT GACGACCGCC TGCTCGAGGA CTACCAAAAG 4500 

GCTATCATGG AAATTCCCAA AATCAGCAAG GTCAAATCGC AAAGAGGTCG CACCTACGGT 4560 

AGCAACATCT ACCTGGATAT TACACTAGAG ATGAATCCTG ACTTGTCTGT TTTTGAAAGC 4620

CATGAAATCG CGGATCAGGT CGAGTCTATG CTGGAGGAGC GTTTTGGCGT CTTTGATACC 4680

GATGTCCATA TCGAACCAGC ACCTATCCCT GAGGATGAAA TTTTAGACAA TGTCTATAAA 4740

AAATTGCTTA TGCGTGAACA ATTGATTGAC CAAGGAAACC AACTAGAAGA ACTGTTGACT 4800

GATGATTTTG TCTATATTCG CCAAGATGGA GAGCAGATGG ATAAAGAGGC TTATAAGACG 4860

AAAAAAGAGT TAAATTCTGC TATCAAGGAG ATTCAAATTA CTTCCATCAG TCAAAAAAGG 4920

AAACTCATCT GCTATGAGTT AGATGGTATC ATCCATACCA GTATCTGGCG TCGCCACGAA 4980

ACCTGGCAAA ATATCTTTCA TCAAGAAACC AAAAAAGAAT AGAGAAATCC TTTCATGAGA 5040

CGGGATTTTT CTATTCTTTT ATACTCAATA AAAATCAAAG TGCAAATTAG GAAGCCGGTC 5100

ACAGGCTGTA CTTGAGTCGG CAATGTGAAG CCGACATAGT TTGCACTTTG ATTTTCGAAT 5160

AGTCTTAACT ATCAAATTCA CTGAGATACT CATAGCGTTC GTATTTTTCA AGGAGTGCTT 5220

CATTTTTCTC ATCCAATTCT TTTTGGAGAG TAGCCAGCTT ACCAAAGTCA GAGCCGTTAG 5280

CCTGCATTTC CTCTTCAATA GCAGCGATAC GTTTTTCCAA GGTTTCAATA TCACCTTCAA 5340

TACTTGCCCA CTCCTGCTTT TCTTGGTAGG TCATGCGTTT CTTGTCTTCT CGAACCTTGA 5400

CCACTTTTTC CTTTTCGGCC TTTTGCACTT GATTGGCCAT ATCTGTTTCA AAAGCTTTTT 5460

CATCAAGATA GTCGGTGTAA TGACCAAAGA AAGGACGAAT CTTGCCATCC TCAAAAGCGA 5520

GAATCTTGGT CGCTACGTTA TCCAAGAAAT AGCGGTCGTG ACTGAGTGTT AAAACGGGAC 5580

CTGCAAAACC TTGCAAGAAA TTCTCTAAGA CTGTCAAAGT TGCAATATCT AGGTCATTGG 5640

TTGGCTCGTC TAAAAGAAGA ACATTTGGTT TTTCCAAAAG CAGTTTGAGG AGATAAAGAG 5700

GTTTTTTCTC ACCCCCTGAC AATTTCTCAA TCAAAGTCCC ATGCGTCGAA CGTGGGAAGA 5760

GGAATTGCTC CAGCAACTCA GCGATGGAAG TCGTAGAACC ACCACTGGTC TTGACCTCCT 5820

CTGCCACTTC CTGCAGGTAA TTGATCACAC GCTTGCTTTC ATCCAAACCC TCAATTTGTT 5880

GAGAGAAATA GGCGATGCGA ACAGTTTCCC CAATCACAAC TTGTCCTGCT GTCGGCTCAA 5940

GACTTCCTGC AATCAGGTTA AGTAGGGTTG ATTTTCCAAC ACCATTGTCC CCAACAATTC 6000

CAATACGGTC TTTAGCCTGA ACTAAGAGAT TAAAATTTTG CAAAATGGGC TTATTTTCAT 6060

AGGCAAAGGA AACATCCTGA AACTCGATGA CTTTCTTCCC AATCCGACTG GTTTCAAAGT 6120

TCATAGTCAA GTCTGTCTCA GCACTACTGC CTGAAACTTC CTTTTTCAGA TCATGGAAAC 6180

GATTGATACG AGCTTGTTGC TTGGTCGCAC GCGCCTGCGG TTGTCTGCGC ATCCAGGGCA 6240

ATTCTTGTTT GTAGAGTTGT TCTTTTTTGT GAAGAAGAGC CGCGTCGCGC TCATCCTGTT 6300

CCGCCTTTAG GCGAACATAG TCCTGGTAAT TTCCCTGGTA CTCGGTCAAG CCTGCACGAT 6360

CCAACTCGAA AATCCGTGTT GACAAAGCGT CTAAGAAATA ACGATCGTGA GTGATAAAAA 6420

GGACGGTCTT CTTAGAATTT TTCAAAAAGA GGGTCAGCCA CTCAATAATC GCAATATCCA 6480

GATGGTTGGT CGGCTCATCC AAAAGCAAGA GGTCGTGGTT GCCAAGTAAG ACTTGTGCCA 6540

ACTGTACCCG TCTTCTCAGA CCACCTGACA ATTCCCCAAC AGGAGTAGAT AAGTCTTGAA 6600

TGCCCAATTT GCTAAGAACG GTCTTGACCT GACTTTCGAT TTCCCAAGCT TGGAGAGAGT 6660

CCATCTCTGC CATGACACGT TCCAAACGCG CCTGCTTGTC CTCACTATAG TCGAGCATAA 6720

TCAATTCATA CTCACGAATG AGCTGGATTT CCTTGAGTTC ACTAGATAGA ACCGTATCCA 6780

AAACTGTCTT TCTATCATCA AAATCAGGAT CCTGAGTCAA GTAACCAATC TGGTAATCAT 6840

TTTTAGCTGA AAAAGGACTG ACATCCCCAT CAAATCCAGA AACACCAGAA AGGACGTCCA 6900

AAAGGGTGGT CTTGCCAGTC CCATTGACAC CGATTAAACC AATTCTGTCT AAGTCATGGA 6960

TAATAAAGGA AATATCCCTA AAAACGGTCT TGTCACCAAC GGATTTACTT AGTTTTTCAA 7020

CGATAAAATC ACTCATTTTT TCTCCCTCAG GTAAGCATGG ATGGCTTCAC GATTATTCTC 7080

CAATTCTCCA TCGACAATGG CAAACTCAAT CTCTGTTAAA ATCTCTCCCA AGTCTGGGCC 7140

TGGCTGATAG CCATATTCCT TGATCAAAAT ACCGCCATTA ATCTGAATCT CTTTCTTGTC 7200

ATGGATAGTC AAGCTTTGGT ATTTTTCTGT GATGGCTTGT GGGTTGACTT CTTTTCCTTG 7260

AGCTTGACGA AGATTTTCAG CCTGTAAAAG CAAATCTATG TCAAAGCGAT AACAATCTCG 7320

CTTGCTCAAT TCTCCATTTT CACGCAGAGC CAAAATAATC AGCAAATCCT GAACTTGCTT 7380

GGCAAACTGG CGTGAGGTCT TCCAAGATTT CAAAAATGAC TGCGCATTTT CAATCTCCAA 7440

AGCCCATAGT AAAGCCGCCC AGGCTTGTTC AGAGGATTCA AAAGTAAAAT CAGTCTCCAA 7500

ATCAAACAGT CTGTTGAGCT TGTCGTGGCT AGATGCCATA TCAGGGAGAT AGTCATAAGC 7560

TTGACTCTCA ATCATGGAAG CCAAGCCCCT TCTCCAAAAT GGAGCCAGCA AGAGTTTATC 7620

AAACTCGACG AAGGTACGCT CTACAGAAAT TTTCTCCAAA AGCGGCGTCA AGGTCTTCAT 7680

AGCTTTAAAT GTTTCTGGCT CAAGTGCAAA ACCAAGACTA GCCTGAAAAC GGAAACCACG 7740

CATAATCCGT AAAGCATCTT CGTTGAAACG CTCACTAGCC ACTCCAACTG CTCGCAAGAC 7800

TTGCTTTTCC AAATCTTCTA AACCATGGAA CAAGTCAACG ATTTCTCCTG TCTCATCCAA 7860

GGCAAAGGCG TTGACTGTGA AATCACGGCG TTTGAGGTCT TCTTCTAGCG ATCGTACAAA 7920

GGAAACCGCA CTGGGTCTGC GATAGTCCAC ATAGACATCC TCTGTCCGAA AGGTTGTTAC 7980

CTCATACTCC TCATCCCCAT CTAAGACCAA GACGGTTCCA TGCTCGATTC CGATATCGGC 8040

TGTTCGCGGA AAAATCTGCT TGGTCTCTTC TGGATAAGAA GACGTCGCAA TATCCACATC 8100






GTGGATAGGG CTATGGAGAA GGGCATCTCG AACAGAGCCC CCAACAAAAT AAGCCTCAAA 8160

GCCTGCTTCT TTAATTTTTT CTAATACTGG TAAAGCCTTC TGAAATTCAG AAGGCATTTG 8220

CGTTAATCTC ATAATAAGTG TTCTAATCCA TAGACAAGCT CATGACGCTT GACAACTTCT 8280

TTAATTCCCA AATTGACTCC TGTCATGAAG GAGATGCGAT CATAGGAGTC ATGACGGAGG 8340

GTCAACCCTT CTCCCTGATT GCCAAAGATG ACTTCCTGAT GAGCTACCAA GCCTGGCAAA 8400

CGAACTGAGT GGATGCGCAT ACCATCAAAG TCAGCACCAC GAGCACCAGC AATCAGCTCT 8460

TCCTCATCTG CTGCACCTTG CTGAATTGAC TCTCGAACCT CTGCCATCAA CTCAGCTGTT 8520

TTAATGGCTG TTCCACTCGG AGCATCCTTT TTCTTGTCAT GATGGAGCTC AATAATCTCC 8580

ACATTTGGGA AATATTTGGC AGCCTGCGTC GCAAATTGGA TGAGTAAGAC AGCACCCAAG 8640

GCAAAGTTAG GGGCAATCAG GCCACCCAAG TCTTGGGCAC GAGAAAATTC TTTTAGCTCT 8700

GCAATTTCTT CACTCGTGAA ACCAGTCGTT CCAACTACTG GAGCAAAGCC ATTTTCAAGA 8760

GCAAAACGTG TATTTTCGTA GGCAACAGCT GGAGTAGTAA AATCTACCCA GACATCCGCT 8820

TCAAAACCAG CTAAATCAGC CTTATCCTTG AAAACAGGAA TACCCTGCCA TTCTGACTCA 8880

GACTCAAAAG GATCCAAAAC TGCCACCAAG TCCAAGTCTG GATCAGTCAA TACCATCTGA 8940

CAAGCAGCCT GGCCCATCTT TCCCTTAAAA CCGGCAATAA TTACTCGAAT ACTCATCTCT 9000

ACTCCTGTCT AAGATACAAA GTCCGTAAGA ACACAAAGTG AAAATAGGAA TTCCAATCAA 9060

GAAGTGTCTA CTTCTTGGAA GAACTATCTT TTTCACACAG GGTTCCAGGC GTGTTCAATT 9120

ATCAAGATAC AAAGGACCTT AGCTGCCTCT GAAAAATAGG GAATGGCACT GACTTTCCAC 9180

GAAAGGCAAG ACAGGCATCT TTTTTCAAGA GGCAGGTAGT CCGTGTTCAA TTTCTAAGAT 9240

ACAAGGCATC TTAACTAGCC TAGAAGCGCC AACTAAATCA CTGGAATATA ACCCAGAGCA 9300

ATACTTCCTG CTCCTAGGTG CGTTCCAATG ACACTACCAA ATGTAGCAAG TGAAACATCC 9360

GAACCCAAGC CAAAATCAAG CAAGTGcTGA CGCAATTCTT CAGCCTTTTC AGGAGCATTC 9420

CCATGAATGA CAATGACCCG GTATTGACCT GAAGCCGTTG TTTCCTTGAT AATTTCAATT 9480

AAGCGCTTGG TGGCCTTCTT TTCAGTACGA ACTTTTTCGT AAACTTCAAT CACACCTTGA 9540

TCGTTAAAAT AAAGGATTGG CTTAATGCTA AGCAAATTGC CCAAAATGGC AGCCCCATTT 9600

GAAAGGCGTC CACCTTTTAC CAAATGATCC AAGTCATCTA CCATGATAAA GGCTGACGTA 9660

CGGCTGATTT GAATGGCTAG CTTATCCTGA ATGCTGGCAA AATCATCGCC CTGATCACGC 9720

CAATTAAAGA CGCTTTCAAC CATGATGCCT AGGGGAGCAG TTGTAATCAA AGTGTCTGGG 9780

AAAGCAATGG TTAAGCCCTC ATAGTCATCG ACCATATACT GGATATTTTG GTAAAAACCT 9840

GAAATTCCAG AAGATAGGAA AAGCCCCAAG GCATGTGTAT AGCCTTGTTC TTTGAGCGAA 9900

GTTAAGATCT CATCTAACTT GGCAATACTT GGTTGACTGG TCTTAGGCAA TTCAGAAGCC 9960

TGAGCCATTT TTTGGTAAAA TTCCTCAGCA GACAGATTGA TGCCTTCGAC ATATTCCTCA 10020

CCATCAATAT TGACAGGAAT ATCCAAGACA AACAAGTCTT CTCTTTGCAA GATCTCTGCA 10080

CTGAGATAAG CAGAGGAATC TGTGAAAACA GCTAATTTCA TATTAGAACT CCAAATTAAT 10140

TCCTGGTAAG TCTAATGCAA TTTCAGTCAC TTCGTAAGTC AAACGATTGA GCATGTTCAA 10200

ACATGGACGA GCCAAGGTTT CCACCTCTTC TTGGTTCAAT TCACTTGGTT CATTGACAAT 10260

ACGGCCATCG ATATGGTTTA CTTGTGAGAT TGTTCCACTA ATGACAAACT TATCAAATAC 10320

AATCATAAAG CTCAAGATGA CAATCAAGGA AGTCACTTGA TTTTCTTGGT CATGTTGGAG 10380

CAATTGGAAA TTCACATCCA CCTTGGTTTC AGGAGCTCCA TTTTCATTTT CCCATTCAAA 10440

ATTACGCGCA TCAAAATGAT ACTGACTAAC AAATTCTTGT TCACGTTTAA GATTCATGTC 10500

TTTCTCCATC GGCTACAATA TTATAAGCTA TTGTACCATA ATTTTTTATT TTCATCTAGT 10560

TTTCTAGGAT TTAGTCAATC CCAATTTCAG CACGAACTAC ATCTGTGATG GTATCAACAT 10620

AGTAGTTTAC TTCTTCTGTT GTAGGCGCTT CTGCCATAAC ACGCAAGAGG GGTTCTGTTC 10680

CACTTGGACG AACAAGGATA CGGCCGTTCC CCGCCATTTC TTCTTCCATC TTCTCGATGA 10740

TGGCCTTGAT AGCTGGCACT TCCATGGCCT TTTCCTTCAT GACGTTTTCC ACTCGGATAT 10800

TAACTAATTT TTGTGGATAA ATCGTTACTT CTGCCGCCAA CTCTGATAAG CTCTTACCAG 10860

TTTCCTTCAT GATTTTAGTC AATTGAACTG CTGATAATTG ACCATCACCT GTGGTATTGT 10920

AATCCATCAA GATAACGTGA CCAGACTGTT CACCACCAAG GTTGTAGCCT GATTTTCTCA 10980

TTTCTTCAAC AACGTAGCGG TCACCAACTG CAGTAACTGC CTTGTTAATA CCTTCGCGAT 11040

TCAAGGCCTT GTGGAAACCA AGGTTAGACA TAACAGTTGT CACAATTGTA TTTTGAGCCA 11100

ATTGTCCTTT TTCAGAAAGG TATTTTCCGA TGATGTACAT AATCTTGTCA CCATCAACGA 11160

TGTCACCATT CTCATCAACA GCAATCAAGC GGTCACTGTC TCCATCAAAG GCCAAACCAA 11220

TAGCTGACCC ACTTTCTTTG ACCACTTCTT GAAGGGCTTC TGGATGTGTT GAACCAACAT 11280

TAAGGTTGAT GTTAAGACCG TCTGGTGTTT CCCCGATAAC CGTCAATTGG GCACCAAGGT 11340

CTGCAAAGAT TTGACGGGCA CTGGTAGAAG CTGCTCCATT AGCTGTATCC AAGGCAACCT 11400

TCATTCCATC AAGAGGAGTT CCAGTTGAAA CAAGGTATCC TTCATACTTA CGCArGCtTC 11460

TGGATAATCT ACCAAAATTC CTAAGCCTTC TGCACTTGGA CGAGGAAGAG TGTCTTCCTC 11520

AGCATCTAGC AAGGCTTCAA TTTCTGCTTC TTTTTCATCA TCTAGTTTGA AGCCATCACC 11580

GCCAAAGAAC TTGATTCCGT TATCAAGGGC TGGGTTGTGG CTAGCAGAAA TCATGACACC 11640

GGCACTTGCT CCTTCAGTTT CAACCAAGTA AGCTACTGCT GGTGTTGCAA GGACACCAAG 11700

TTTGTATACG TGAATCCCTA CTGAAAgGAG ACCTGCCACC AAGGCCGATT CCAACATTTC 11760

CCCTGAAATA CGTGTGTCAC GTCCTACAAA GACTTTCGGC GCTTCCGTTT CATGTTGACT 11820

AAGAACATAG CCTCCAAAAC GTCCTAGTTT AAAGGCTAAT TCTGGTGTTA GTTCTAGGTT 11880

AGCTTCTCCA CGGACTCCAT CAGTCCCAAA ATATTTACCC ATTGTTATAA AATCCTTTTC 11940

TATTTTTAAT TCGTTTTTGA ACTAGTTGCT TTCGTTGACG AAGATGTCTC CGATGAACTG 12000

CTTGTACTTG AATTTGATGT GCTTGAACTT GGTGCTACTG GTTTTGTAGT CACCTTCATT 12060

ATTGTATCAA ACGGAGTGAT AACTGCCGGT AAGACAACAC CATTGCGGTC GATTGCCTGC 12120

AAAGGTACTG AACCACTGTA ATTACCTGTT ATACGTTCGC TAGTTGGCAA AACAGCGATA 12180

ATCTTATCAA TTCTATCCAA TGTCTCTTGG TCACTCGTAA TAGACACTTC TTTATCTGAC 12240

ACCATGACAT TTTCAATTTG TACCCGACTA TCAATTTGAC TAGGGTCAAT CTCTGGTACA 12300

ATCTTTACCT TATCCTTCTG AGCCTTCTTA CCAATCTTGA CTGTAATTTT TTGCGGAGTC 12360

GCCACAGCGG TCAGCCCATT GGGTAAATCT TCAATGCTCA AAGGAACTTC AATCGTTCCA 12420

ACACCGGCAT CTGTTAGGTC AGCAGTAACC TTGAATTTAC GTGTACTTTC TTGCATTTCA 12480

CTAGCTAGCG ATAGGCGATT TGCACCAGTC AAGACCACTG ATACTTCTGA AGCAAAACCG 12540

CTAATAAAAT ACTTATCACT ATTATAGCGT ATGTCAATAG GGACATTTGT TACTGTATTA 12600

GTATAGGTTT CCGTTTTTAC CTGCCTAGCA CTGGTACTGT TTTGAAAATT CGTCGCCGTA 12660

GCATAGACAA ATAAGACACA AGCAAAAAAG AGTGAGGATA TGATATATAA ACTATTTTTT 12720

TTCATGTTTC CATCCTCCTA GCAATCGTTC TTTAAAACTA AGACCCACTT CCTCTTTTGG 12780

AAGTAAGATT TCACGTAATT CTGTTTCAAA TTCATCAAGT GTTAGGTTGT GCTTAAACCT 12840

TCCATTATAG GTTATCGAAA TTCCTCCCGT TTCGTCTGAT ACGACAAAAG TCAAGGCATC 12900

TGAGACTTCT GATAAACCGA TAGCCGCCCG GTGTCTGGTC CCAAATTCCT TGGAAATCCC 12960

TGTGTTTTTT GTCAAGGGCA GATAGGCAGA CGTCACAGCG ATACGTTCTT CTTTGATAAT 13020

CACCGCACCA TCATGTAGGG GAGTGTTGGG AATAAAAATG TTAATGAGAA GTTCTGCAGA 13080

AATCTTAGCA TCCAAGGGAA TTCCTGTCGA AATATACTCC TGCAAGGTAC GTACACGCTG 13140

AATAGCAACC AAGGCCCCGA TTTTACGAGG ACTCATGTAT TCAACAGACT TAACAAAGGC 13200

ACGAATCATC TGTTCCTCAG CACTAATAGG GGCATTGGAA AAGAAATCTG TCGCTCTTCC 13260

CAAACGTTCC AAACCAGTCC GAATCTCTGG AGAGAAGATA ACAACCGCCG CAATAACCCG 13320

ATAAGTAATA ATTTGATTGA TTAACCAAGA AATCGTAGTC AAACCAATCA TATTTGCAAG 13380

GATTTGAGCT AAAATAAACA CCAAAACTCC ACGTACCAAA ATCATAATCT TGGTTCCTGC 13440

AATAGCTTTT GTAAAATGGT ATAAAATATA AGCAACAATC AAAATATCAA TCAGATTGAT 13500

AGCTATCGTC CATGGACTTG CAAACAAACT GGTCCAATAT TGCAGATTGG ATAATTGTTG 13560

AAAATTCATC CCTGATATCC TCCCTATCAA AACACTTTCG TCCTATTATA CCATTTTCTG 13620

GCATTTTTTT CCCTATCCTA GTCCATTTTA CATTGAACAA AAATATGATA AAATAAACTG 13680

ACTAAAAAAA ACAAAGGAGA AACTATGTCT CAACTCTATG ATATTACCAT TGTGGGTGGT 13740

GGTCCTGTCG GGCTTTTTGC AGCCTTTTAT GCCCACCTAC GCCAAGCCAA GGTTCAAATC 13800

ATCGACTCTC TTCCCCAGCT AGGTGGACAA CCTGCTATTC TCTACCCTGA AAAGGAAATC 13860

CTAGACGTAC CAGGCTTCCC AAACCTGACT GGAGAAGAGT TGACTAACCG CTTGATTGAA 13920

CAGCTAAATG GATTTGATAC CCCTATTCAT CTCAATGAAA CGGTTCTTGA GATTGACAAA 13980

CAAGAAGAAT TTGCCATCAC AACTTCTAAA GGAAGTCACC TGACTAAAAC AGTTATCATC 14040

GCTATGGGTG GCGGTGCCTT CAAACCACGT CCGCTGGAAC TTGAAGGGGT TGAGGGCTAT 14100

GAAAATATCC ACTACCACGT TTCTAACATT CAGCAATACG CTGGTAAGAA AGTGACGATT 14160

CTTGGTGGGG GAGACTCGGC TGTGGATTGG GCTTTGGCTT TTGAAAAAAT CGCACCAACT 14220

ACCCTTGTTC ACCGCAGAGA TAATTTCCGT GCCTTGGAAC ACAGTGTTCA AGCCTTGCAA 14280

GAATCATCTG TAACCATCAA GACACCATTC GCCCCTAGCC AACTCCTTGG AAATGGAAAA 14340

ACACTTGATA AACTTGAAAT CACAAAAGTC AAATCTGATG AAACTGAAAC CATTGACCTA 14400

GACCACCTCT TTGTCAACTA TGGTTTCAAA TCTTCTGTCG GTAACCTTAA AAACTGGGGG 14460

CTCGACCTCA ACCGTCACAA GATTATCGTC AACAGCAAAC AGGAATCCAG CCAAGCAGGT 14520

ATCTATGCTA TCGGTGACTG CTGCTACTAT GACGGAAAAA TTGATCTGAT TGCGACAGGC 14580

CTCGGAGAAG CTCCAACTGC TGTCAACAAC GCTATCAACT ACATTGACCC TGAACAAAAA 14640

GTACAACCAA AACACTCTAC TAGTTTATAA AAAAGAACCA CGAGTCACAT AGGATTCGTG 14700

GTTTTATAAT TCATCCGCTA TCTTATTGAT TTTTCTGAGT CTGTGATTGA CACCACTTTT 14760

GGTCAGAGGG GTGCTGAGAC TATCTGCTAA CTGCTGGATA GAGTAGTCTG GGTGCTGAAT 14820

CCTCAGTTGC GCCACTTCCT GCAAATCTAC TGGCAAATTT TCTAAGCCCA TGATATCTTT 14880

GATTTTACTG ATATTGTTAA TGGTCTTCAT GCTGGCAGAA ACTGTCCGAG CGATATTAGC 14940

TGTCTCGGCA TTATTAGCCC GATTGAGGTC GTTACGGGTT TCTCGCAAAA TCTTAACCCG 15000

CTCAAAATCA TCACGTGCCT GCATGGCTCC TATTACTATC AAGAAGTCCA TAATGTCTTC 15060

TGCTCGCTGG AGATAGGTCA CAGCCCCCTT CTTGCGCTCA AGCACCTTGG CATCCAGTAA 15120

AAACTGTTGG AGAAGGGAGG CAATTCCTTG CGCGTGGTCC AGATAAACAG AACTGATTTC 15180




CAACTGGTAC TTGCCTGACT CAGGGTCACG AATGCTCCCA TTTGCCAAGA AAGCGCCACA 15240 

GAGATAGGCA CGACCTGCTT CCTCATCCGA TAAAATCGCC TCATCAATAC CTGTTTCCAG 15300 

GCCAAAGAAA GAGTCTGCCA AGTGCAAATC ACTTAACAAA TCCTGCACCT TTTCATCTGT 15360 

AAAAACGGTA TAGACGCGAT TCTTGCGAAG ATTGCTCCGT TGGTGGTGAC GAATTTCAGA 15420 

TTTGATTTCA TAGAGATGGA GAAAGGACTC ATAGAGGTGA CGGGCCAGTT TGGCATTTTC 15480 

TGTCACAACT GACAAAGTCA AGCCCGAAGT CGAGAGACCG ATGCTAGCAG ACATTTTGAT 15540

AATGGCAGAT AATTCATGCC AGCTCAGATG GTGTTGGCCC AGGATTTCTT CTTTTACTGC 15600 

TACTGTGAAA CTCATTTTTT CACCTGTATA ATGCGCATCA ACTCGTCCAC AATCAAATCT 15660 

CCATCGTGGA AGGCACCGCC ATTTTCCAGA CGAAGGAAGT TAGATGAAAT CACGCGCGAA 15720 

ACTTGCTTAC AAAGACCTAC AAAATCGTGT TCCACTTGCA CTAAGTATTC ATCAAAACGG 15780 

TTGGAATTCA TGTATTCCTG AGGCACTTTT TCAATATTCA CCAAGACAGT GTCGATAAAA 15840 

GGGCGACCAA GGTGACGATG CAAGACTTCC ACGTGGTCGC TATCTGTAAA GTGTTCCGTC 15900 

TCCCCACGTT GGGTCATGAT ATTGCAGACA TAGGCAATTT CTGCCTTGGT TTCCAAAAGA 15960 

GCCCGCCCAA TTTCCTTAAT CACGATATTG GGCAAAATAG AGGTAAAGAG GGAACCTGGC 16020 

CCTAGGACAA TCATGTCACT TTCAAGGATG GTCTGCACTA CTCGACGGCT GGCCAGAGGC 16080 

GTATCATCGT TTAGGGCATT GGTCACATAG ACATTGTCAA TTATGCCTCG ATGGTCTACA 16140 

ATATGACTCT CTCCAGCCAC TTCTGTCCCA TCCTGAAAGA CTGCATGAAG GGTCAAAGGA 16200 

TGGTCACTGG AAGGATAAAT TTTCCCTGTT GTATGGAAAA ATTTGCTCAA TAACTGCATG 16260 

GCATTATAGG TTGAACCCTG CATTTCTGAC AAGCCAGCAA TGATGAGATT TCCCAATGGA 16320 

TGGCCAGCAA AGGCTCCGGC ATCCTCAGAG AACCGATACT GAAAGACCTT CTCATAAAAC 16380 

TTAGGCATAT CCGACATGGC CACAAGGACA TTACGAAGAT CACCTGGCGG TGTCAACTGT 16440 

TGCATATTTT TTCGGAGTTC ACCTGAAGAA CCACCATCAT CTGCCACCGT CACGATAGCT 16500 

GCGATTTCCA CATCTTTTTC CCGCAGACTT TTTAGAATGA CGGGACTTCC AGTCCCTCCA 16560 

CCAATCACCG TTATCTTTGG TTTTCTCATG AACGGTTTAC CGTTTCCTTT CTGCGGTCTT 16620 

TGTCGCGATG CCCTTCATTA ACAGACCAAT TCTTGGATAA GTCCTGCGCC AAGCGTTTAG 16680 

CAAATGCCAC ACTACGGTGT TGTCCACCCG TACATCCCAT GGCAATGGTC AAAACGGACT 16740 

TACCTTCCTT TTGGTAACTT GGCAGAATCG GCTCAATCAA GGCCAATAAA TGTTGATAAA 16800 

AGTCTTCTGA CTCAGGATGG TTCATGACAT AATCATAAAC AGGTTCATCC ACACCCGTTT 16860 

GGTTTCTCAG TTCTGGTAAA TAATAGGGAT TTGGCAAGAA ACGGACATCA AAGACCAAGT 16920 
CCGCATCAAT CGGGATTCCA TACTTAAATC CGAAAGACAT GACTTCGATA CGGAAAGACT 16980

GGGCTTGTTC TTGGTCTGAA AACTGCTCTG CAAGGGTTTT GCGCAgCTCA CGTGGAGTGA 17040 

GTTCAGTCGT ATCCACCACA TTTTGGCTCA TATTTTTCAA AGGTGCCAAG AGTTCACGTT 17100 

CCAACTTGAT TCCATCTAAA ATACGACCGT CTGCTGCTAG TGGGTGACTC CGTCTGGTTT 17160 

CCTTGTAACG AGCGACCAAT TCCTTATCAG CCGCATCCAA AAAGAGGATT TTGAAATCCA 17220

AACCATCTTG ATTTTCCAAC TCATCCAAAA CAGCTTGAAT CTCTGAAAAG AAAGAACGGC 17280 

TACGCATATC CACTACCAAG GCCAACTTAG GATTGTCTTC CTTAATTTCA ACCAGCTGCA 17340 

AAAACTTAGG CAAGAGAGCT GGCGGCATAT TATCAATGGT GAAATAACCT AGATCCTCGA 17400 

AGGACTGAAT GGCTACAGTT TTCCCTGCGC CACTCATCCC TGTCACAATC ACCAAGTGAA 17460 

GTTGTTTCTT TGTCATCTTT TTCTCCTTAT ATCAAAAGAA GTTTGGCAAC ACCAAACTTC 17520 

AACTAGCTTA TCCAATCTCT GCGATGACTT CAATTTCGAC TTTTACATCA CGAGGAAGAC 17580

GAGCTACCTC CACAGCTGAA CGAGCTGGGA ATTCCTCTTT GAAGGCCGTT TGGTAAACCT 17640 

CATTAAAAGG AACAAAGTCG TTCATATCGC TCAAGAAGCA AGTTGTTTTG ACAACATGGT 17700 

CAAAGTCTGT TCCTGCTTCT GCCAAAATAG CACCGATGTT TTTCAAGACT TGCTCTGTCT 17760 

GTTCTTGGAT ATTCTCTCCT ACAATTTCCC CAGTTTCAGG GGATAGGGGA ACTTGACCGC 17820 

TAGCAAACAA AAGGTTGCCA ACGATTTTTC CTTGAACATA GGGTCCGATA GCCTTTGGGG 17880 

CCTTATCTGT ATGAATTGTT TTTGCCATTT TCTTTTCCTC ACAATTTTTC TAAGATTGCA 17940 

TCCCAAGCCT CATCCATCCC TGCCTTACTG ACAGATGAAA AGAGGATGAA ATCGTCACTC 18000 

GGGTCAAAGT TTAATTTCTT TTTGATTGCT GATTCATGCT TGTTCCATTT ACCACGAGGA 18060 

ATCTTGTCCG CCTTGGTCGC CACAATGATG ACTGGAATCT CATAATACTT GAGAAATTCG 18120 

TACATCTGCA CATCATCTGC TGACGGGTCA TGACGAAGGT CAACTAGACT GACAACCGCA 18180 

CGGAGATTTT CCCGAGTCGT TAAGTACTCC TCAATCATGC ACCCCCAGTT TTCACGTTCC 18240 

TTTTTAGAAA CACGAGCATA GCCATAACCA GGCACATCCA CAAAGCGCAT CTTGTCATCA 18300 

ATGTTAAAAA AGTTCAGGAG CTGGGTTTTA CCAGGTTTTC CTGATGTACG GGCGAGATTC 18360 

TTACGGTTCA ACATAGTGTT GATAAAGCTG GATTTACCAA CATTTGAACG CCCTGCTAGG 18420 

GCAATCTCTG GCAGTTCATC CTGCGGATAG TGGGACTTAT TAGCTGCACT GAGCAAGATT 18480 

TCAGCATTGT GTGTATTAAG TTCCATAGTC ACCTCTAGGC TGTTTCTAGG ATCGGTTTAT 18540 

CCGTTCCATC TACAGTTTCT TTAGTGATGC GAACCAATTT CACATTTTCC TGACTCGGCA 18600 

CCTCAAACAT GACATCTAGC ATGGTTTCTT CGATGATGGA GCGAAGTCCA CGCGCCCCTG 18660 

TCTTCCGTTC GATTGCTTTA TTAGCAATCT CTTGAAGGGC TTCGTCGTCA AATTCCAACT 18720

CAACATCATC ATAAGAAAGC AAGGTTTGGT ATTGTTTCAC CAAGGCATTT CTTGGCTCTT 18780

TCAAGATGCG AACCAAGTCA TCAACGGTCA ATTGCTCAAG AGCCGCAAAA ACAGGCAAGC 18840 

GTCCAATCAA CTCAGGGATA ATACCAAATT TTTGAATGTC TTCAGCGATG ATTTCTTGCA 18900 

TGTATGAGCT GTTTTCGTCA ATCGCCTTAT TATTTTGACC AAATCCGATG ACTTTTTCAC 18960 

CCAGACGTTG TTTGACAATT TCTTCAATAC CATCAAAAGC ACCACCCACG ATGAAGAGGA 19020 

TATTTTTTGT ATCCACTTGA ATCATCTCTT GTTGTGGATG TTTGCGTCCA CCTTGAGGCG 19080 

GTACGCTAGC AACAGTTCCC TCAATAATCT TGAGAAGGGC TTGTTGCACC CCTTCACCAG 19140 

AAACATCACG TGTGATAGAC ACATTCTCAC TCTTCTTGGC AATCTTGTCA ATTTCATCCA 19200 

CATAGATAAT GCCACGCTCT GCACGTTCGA TGTTAAAGTC AGCAACCTGC AAGAGTTTGA 19260 

GGAGGATATT TTCCACATCC TCACCCACAT AACCAGCCTC CGTCAGAGCT GTCGCATCCG 19320 

CAATAGCAAA AGGTACATTC AAGCTCTTAG CCAAGGTCTG GGCAAGGAAA GTTTTCCCTG 19380 

AACCAGTTGG GCCAATCATC AAAATGTTTG ACTTCTGCAA ATCCACATCT TCTGACTCTT 19440 

CGCGTGTATC GTGGAAATTG ATGCGTTTGT AGTGGTTATA AACCGCCACT GCCAAGGCAC 19500 

GCTTGGCACG ATCTTGACCA ATTACATAGT GGTTCAAGAT ATGGAGGAGT TCAATTGGTT 19560 

TTGGCACCTC AGACAAGTCT GCGAAGACTT CCTCAACCAA TTCTTCTCGA ATGATTTCCT 19620 

GAGCTAACTC CACGCATTCA TTACAAATAA AAGCATTGTT GCCAGCAATT ATTTTTTGTA 19680 

CTTCTTCTTG GTTTTTGCCA CAAAATGAGC AATAAACCAT CATATCATTT TTTCTATTTG 19740 

TAGACATGAT TTCCTTCCAT TCTATACTGT CATTCTATCT AAAATAAGGT CATGTAAAAA 19800 

GCATGAATAC TATTGACCAG ATTGGTAAAG GCATTTAACC AAAGGAGGAT AGAAAGCCCG 19860 

TAACGCTTTT TACGAAAAGC TTGTGCTCCT GCCAGAAAGC AGATGAAACA CAGAAAAGCC 19920 

GTGAATAGAC CAAATAAACT CCGTTCCATT AGACTTCCTT TCTCTTGCGG TATTGGATGG 19980 

TAAAATCATA AGGATTCTTC TCATCTTTGG CGTAAAATTT GCTTGAAACT GTCTCAAAAA 20040 

GAGACAAGTC AAGTTCTTCA GGGAAATAGG TATCTCCTTC CACCCGAGCA TGAATGTGAG 20100 

TGACAATCAC TTCATCAAGG TAAGGTTCAA AAGCCTGAAA AATTTGCTTC CCACCGATAA 20160 

TGTAGAGATT CTTTTCTTGA GCCTGATACC AGTCAAGAAC AGACTGGACG TCCTGAAAAG 20220 

TAGCAACCCC ATCTATCTTT TCTTCCGGAT TACGCGTCAA AATCAAGGTT TCCCGTTTTG 20280 

GAAGCAAGCG ACGCCCCATC CCATCAAAGG TCACACGCCC CATCAAGATA GCATGATTCA 20340 

GAGTTGTTTC TTTAAAGTGC TGCAATTCTG CTGGCAAATG CCAAGGCAGA CGATTTTGCT 20400 

TACCAATCAC ACCCTCTTCA TCCTGGGCCC AAATAGCTAC GATTTTCTTA GTCATGCTTC 20460

CATCCTTTTC ACTGATAGTA CTATTTTATC AAAAAACTCA AAAAAAGACT GGTTTGGAAT 20520

AGCTTACAAA ATAGAAAAAA TCTGTAAGAA ATTTCCTACA GATTTATCTA TGTTTCCTTA 20580

TTTCTTACAA ACCAGGTGCT TGTCCAAGTT CGGCTGCAAG CATCCAAATT GTTTTATCTG 20640

TTTCAGTTTT AGCGCCTGCA AAGATACCGT TTGTCACATC GTCACCTTCT TCATCAGTGA 20700

CATCCAAACC TTTTTGGAAA AGTTCTGACA AGTAACGGTA GATAACAAGA ACACGTTCCA 20760

AGCTTTCTTC AACATTACGG TATTCACCAG CTTCTTCTTC GATTTCACTA TTTTGAAGGA 20820

ACTCTGTCAA TGTAGAGAAT GGGCTTCCAC CGAGTGTAAT CAAGCGTTCA CTGATTTCAT 20880

CCAATTGACC GTCAAGAGCT TCCATGTACT CATCCATTTT TGGATGCCAT ACAAGGAAAC 20940

CACGACCATG CATATACCAG TGCACTTGGT GCAAAGCAAC GTGAGCTACA TACAAATCAG 21000

CAACAGCTTG GTTCAAGACT TCCTTTGTTT TTGCCAATGC 21040

(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2387 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 
ATTCTTAATA CGATTAAAAG GCTTATTACT AAAAGAAAAT TTCAGTTAGA TGAACTAAAC 60

TTGCTCGTCA AATCCCGATT TAACGAGATG TTTGGGGAAA ATAAAATATT TGAAAGCATT 120

GATAACTTAT TTGATATTAT AGATGGTGAT AGGGGCAAAA ATTATCCTAA ATCAGATGAG 180

TTGTTTAGTG AGGAGTACTG TTTATTTTTA AATACAAAGA ATGTTACTAA AAACGGATTT 240

TCATTCGATA CAAAGCAATT TATCACTAAA ACAAAGGATA AATTACTTCG AAAAGGCAAA 300

CTTGAGCGTT ATGATATAGT CTTGACAACA AGAGGTACTG TTGGAAATGT AGCGTACTAC 360

GATGAATTAA TAAAATATAA ACATTTACGT ATAAATTCAG GTATGGTAAT ATTACGTCCC 420

AAGACACCAA ATCTAAATCA GAAATTTATT ATCCATGTTT TAAGGAATAA TAATTATAGT 480

CGAGTGATAT CAGGAAGTGC TCAGCCTCAG TTACCAATTA CAAAATTAAA AAAAATACTT 540

CTCCCCCTCC CCCCACTAGC CCTCCAAAAT GAGTTCGCAG ACTTTGTAGT CCAGGTCGAC 600

AAATCACAAT TGGCAATCCA AAAATCTCTG GAAGAACTTG AAACTTTGAA GAAATCTCTG 660

ATGCAGGAGT ATTTTGGCTG ATATTCTGCC ATTGTAATTA CGGTAATGAT TTGTTATAAT 720

ACTTCAAAGG AGGAAATCAG ATGGTAGTAA AAACAAGAAA ACAAGGAAAT TCAATCACCA 780

TTACGATTCC AAGTGAATTT AATATTCCAA GTGGTGTTAA ATACGAAGCG AAATTGTTAC 840

CAAGTGGTGA GATTATCTTT ACTCCTGAAG AATTGGGGCA GCAGGTTTCT TATGTATCTG 900

ATGATGCCTT TGACTTAAAT TTAGATAAAA TATTTGACGA ATACGACGAT GTTTTCAAAG 960

CTTTGGTGGA AAAATGACAA TCTATTTGAC AGAAAAGCAA ATTGAAAAAA TAAATGCTTT 1020

AGCAATTCAA CGGTATTCTC CAAATGAGAA AATTCAAACA GTTAGTCCTT CTGCCTTAAA 1080

TATGATTGTG AACTTACCAG AACAATTTGT CTTTGGGAAG CCTCTTTATC GAACAATTTT 1140

TGATAAAGCA ACGATACTAT TTGTCCAATT GATAAAGAAG CATGTTTTTG CTAATGCTAA 1200

TAAAAGAACT GCTTTCTTCG TTTTGGTCAA ATTTTTACAA TTAAACGGCT ATCGTTTTTC 1260

TGTAACGGTA GAAGAAGCAG TAAAAATGTG TGTAACCATC GCAGTAGAAG CTTTAACTGA 1320

TGAAAAAATG ACAAGCTACT CCAAATGGAT TTCTGAACAT TCTGTTAGAG AAAAGGTCAA 1380

AAAGTAACCT AGTATGCTGG ATTTGAATGA GCACAAGAAA ATAAATGAAC AGACAATATT 1440

AGAATTCTGT AATGCAGAAA CTGATATTGT CTCTTTTTAT TGATGAATAA GAAAGTGAGA 1500

AATTATGGAA TCAAAAGTTA CAATTATCAT GCAAGAAATG TTACCTCTTT TAAATAATGA 1560

ACAATTACTA GCGTTGAGAG AGAGTTTAGA ACATCATCTA GTAGACGGAA AAAAGCAGCA 1620

GAAGTATTCG AATAATAACC TGTTGCAACT ATTTATTACC GCCAAGCAGG TAGAGGGCTG 1680

TAGCTCAAAA ACAATTCGTT ATTATCAGAG GACGATTGAA AACTTGTTTA ATGCTATTAA 1740

AGAGTCTGTG ACACAACTCA CAACAGATGA TTTAAGGAGT TATTTAGCAA ATTACCAGTC 1800

TGAAAAGGAT TGTAGTAAGG CAAATTTAGA CAATATTAGG CGTATATTGT CTTCTTTTTT 1860

TGCTTGGCTT GAGCAAGAGG ATATATCATT AAAATTCCCA TTCGACGGAT ACAGAAAATT 1920

AAGACTGAGC AAAATGTGAA GGAAACTTAT ACTGATGAAC ATTTGGAAAT TATGCGTGAT 1980

AACTGTGAAA ATTTGAGAGA TTTGGCAATA ATAGACCTAG TAGCATCGAC AGGTATGCGT 2040

GTAGGGGAGC TTGTACAGTT GAATCGTTCA GATATTGATT TTGAAAACAG AGAGTGTGTT 2100

GTCTTTGGTA AAGGAAAGAA GGAGAGACCA GTATATTTTG ACGCTCGTAC GAAAATTCAT 2160

TTAAGAAATT ATCTTAACGA CAGAAAAGAT AGTCACCCTG CTCTTTTTGT AACGCTAGTT 2220

GGAAAAGTCC AGAGGCTTGG AATTGCTGGT GTAGAGATTC GCTTAAGAAA GTTAGGAGAC 2280

AAACTCGGCA TACAAAAGGT TCACCCACAT AAGTTCAGAA GAACTTTAGC GACTAAGGCA 2340

ATTGATAAAG GTATGCCTAT CGAACAAGTC CAAAAACTGC TAGGTCA 2387



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10669 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 
ATATTAAAGC GACTTTCTGT GCGCTAGGGA AAAATGTTCC TGGGAATGAG GACTTGGTGA 60

AGAGGATAAA ATCTGAAGGT CATGTTGTTG GAAACCATAG CTGGAGCCAT CCGATTCTCT 120

CGCAACTCTC TCTTGATGAA GCTAAAAAGC AGATTACTGA TACTGAGGAT GTGCTAACTA 180

AAGTGCTGGG TTCTAGTTCT AAACTCATGC GTCCACCTTA TGGTGCTATT ACAGATGATA 240

TTCGCAATAG CTTGGATTTG AGCTTTATCA TGTGGGATGT GGATAGTCTG GACTGGAAGA 300

GTAAAAATGA AGCATCTATT TTGACAGAAA TTCAGTATCA AGTAGCTAAT GGCTCTATCG 360

TTTTGATGCA TGATATTCAC AGTCCGACAG TCAATGCCTT GCCAAGGGTC ATTGAGTATT 420

TGAAAAATCA AGGTTATACC TTTGTGACCA TACCAGAGAT GCTCAATACT CGCCTAAAAG 480

CTCATGAGCT GTACTATAGT CGTGATGAAT AAGCAAGAAA AAATAGGTCT GTTAGATATT 540

TGACAGACTT ATTTTTTACA GAATATAGTA CTACTTAAAA AATGTTTTAT GCTATAATTG 600

ATGAATAAAA TAGAAGGAGA AGCATATGAA TACCTATCAA TTAAATAATG GAGTAGAAAT 660

TCCAGTATTG GGATTTGGAA CTTTTAAGGC TAAGGATGGA GAAGAAGCCT ATCGTGCAGT 720

GTTAGAAGCC TTGAAGGCTG GTTATCGTCA TATTGATACG GCGGCGATTT ATCAGAATGA 780

AGAAAGTGTT GGTCAAGCAA TCAAAGATAG CGGAGTTCCA CGTGAAGAAA TGTTCGTAAC 840

TACCAAGCTT TGGAATAGTC AGCAAACCTA TGAGCAAACT CGTCAAGCTT TGGAAAAATC 900

TATAGAAAAA CTGGGCTTGG ATTATTTGGA TTTGTATTTG ATTCATTGGC CGAACCCAAA 960

ACCGCTCAGA GAAAATGACG CATGGAAAAC TCGCAATGCG GAAGTTTGGA GAGCGATGGA 1020

AGACCTCTAT CAAGAAGGGA AAATCCGTGC TATCGGCGTT AGCAATTTTC TTCCCCATCA 1080

TTTGGATGCC TTGCTTGAAA CTGCAACTAT CGTTCCTGCG GTCAATCAAG TTCGCTTGGC 1140

GCCAGGTGTG TATCAAGATC AAGTCGTAGC TTACTGTCGT GAAAAGGGAA TTTTATTGGA 1200

AGCTTGGGGG CCTTTTGGAC AAGGAGAACT GTTTGATAGC AAGCAAGTCC AAGAAATAGC 1260

AGCAAATCAC GGAAAATCGG TTGCTCAGAT AGCCTTGGCC TGGAGCTTGG CAGAAGGATT 1320

TTTACCACTT CCAAAATCTG TCACAACCTC TCGTATTCAA GCTAATCTTG ATTGCTTTGG 1380

AATTGAACTG AGTCATGAGG AGAGAGAAAC CTTAAAAACG ATTGCTGTTC AATCGGGTGC 1440

TCCACGAGTT GATGATGTGG ATTTCTAGAA AATCATAAAA AGAATTGTAC ATTATTCTAA 1500

TTTTTGATAT AATAGTCAGC AGGAAAGAAA GTCTTATGGC GTTCTTCAAG CGAGCTTGGG 1560

ATAGTGGGAG CCAAGTAGGG CAAAATAAAG GGCTGGCGCT TTCTGTAGTA TTTTCAAAAA 1620

CAATGAAGTA ATAAATTAGG GTGGAACCGC GTTTCTGACG CCCCTAGGTT AAATCAACCT 1680

AGGATTGTCA GATGTGGTTC TTTTGCTTAT TCAGTCTATT GTGTGAAAGA AAGGAGAGCC 1740

GTGGACAACC TTTATCTTGT AAAAGACGAT AGTCAACTAG CTACATTTCG TGATTTTGTA 1800

GTAAGAAATA CTGAAAAGTT GAAAGATTAT CAATCTTTTT TAAAGAATGA ACTTGCAGTC 1860

TGTGATTTAC CGCAAGCTGT TATTTGGTCA GATTTTAATG CTGCTACACA GATTATTAGG 1920

GAAAGTGCTG TTCCAACCTA TACAAATAAT AGACGAGTGG TTATGACGCC TGATTTAGCT 1980

GTTTGGAAAG AATTGTATTT GTATCAGTTG ATGGACTACG AGTGTTCTGA GCAAACTCAA 2040

GCAATAGAAA GTCACTATCA TTCTTTATCT GAAAATTTCC TCTTACAGAT TGTAGGACAT 2100

GAGTTAGCTC ATTGGTCGGA CATTTTTTAG ATGATTTTGA TGGTTATGAC TCTTATATCT 2160

GGTTCGAAGA GGGGATGGTT GAATATATTA GTCGCAAGTA TTTCTTGACA GAAGAGGAAT 2220

TTCAAGCGGA AAAAATTTGT AATCAATCTC TCGTAGAACT TTTTCAGAAG AAGTATAGTT 2280

GGCATTCATT GAATGATTTT GGTTCTTCGA CTTATGATAA GAACTATGCA AGTATTTTTT 2340

ATGAATACTG GCGCAGCTTT TTGACAGTAG ATAAGTTGGT AGAAAATTTA GGTAGTGTAC 2400

AAGCGGTCTT AGATTCTTAT CATTTATGGG CAAATACAGA AAAAACTTTT CCCTTGTTAG 2460

ATTGGTTTGT TCAGCAGAAA TTAATTGAAA AAGAAATATA AAAACTAAAG GAGTAAACAA 2520

TGTCTAAGAA ATTAACATTT CACTGCATCA GTGGCAGAGA CCTCCTTACA GTCGGGCTGC 2580

TCCACGCTCA GCACTAGAGT GCCTGAGCTA GACGCAGTAC TAACTCGTCT TGCCTCGTAT 2640

GATCGACGAG GCAGACTCGT GTCGCAAGTA ATTATTTTTT ATTAAGGAGT ATTCAATGTC 2700

TAAGAAATTA ACATTTCACT GCGTCAGTGG CAGAAACCTC CTTACAGTCG GACTGCCCTA 2760

CGCTCAGCAC TAGAGTGCCT GAGCTAGACG CAGTACTAAC TCGTCTTGCC TCGTATAATC 2820

GACGAGGCAG ACTCGTGTCG CAAGAAATTA TTTTTTATTA AGGAGTATTC AATGTCTAAG 2880

AAATTAACAT TTCAAGAAAT TATTTTGACT TTGCAACAAT TTTGGAATGA CCAAGATTGT 2940

ATGCTTATGC AGGCTTATGA TAATGAAAAA GGTGCGGGGA CAATGAGTCC TTACACTTTC 3000

CTTCGTGCTA TCGGACCTGA GCCATGGAAT GCAGCTTATG TAGAGCCATC ACGTCGTCCT 3060

GCTGACGGTC GTTATGGGGA AAACCCTAAC CGTCTCTACC AACACCACCA ATTCCAGGTG 3120

GTCATGAAGC CTTCTCCATC AAATATCCAA GAACTTTACC TTGAGTCTTT GGAAAAATTG 3180

GGAATCAATC CTTTGGAGCA CGATATTCGT TTTGTTGAGG ACAACTGGGA AAACCCATCA 3240

ACTGGTTCAG CTGGTCTTGG TTGGGAAGTT TGGCTTGACG GAATGGAAAT CACTCAGTTC 3300

ACTTATTTCC AACAAGTCGG TGGATTGGCA ACTGGCCCTG TGACTGCGGA AGTTACCTAT 3360

GGTTTGGAGC GCTTGGCTTC TTACATTCAA GAAGTAGACT CTGTCTATGA TATCGAGTGG 3420

GCTGATGGTG TAAAATACGG AGAAATCTTT ATCCAGCCTG AGTATGAGCA CTCAAAATAT 3480 

TCATTTGAAA TTTCGGACCA AGAAATGTTG CTTGAAAACT TTGATAAGTT TGAAAAAGAA 3540

GCTGGTCGTG CATTAGAAGA AGGCTTGGTA CACCCTGCCT ATGACTATGT TCTCAAATGT 3600 

TCACATACCT TTAATCTGCT TGACGCGCGT GGTGCCGTAT CTGTAACAGA GCGTGCAGGC 3660 

TATATCGCTC GTATCCGTAA CTTGGCCCGT GTCGTAGCCA AAACCTTTGT CGCAGAACGC 3720 

AAACGCCTAG GCTACCCACT TTTGGATGAA GAAACAAGAG CTAAACTCCT AGCAGAAGAC 3780 

GCAGAATAAA GAGAGTGACA AATTACGAAA ATGGGCGAAC AGAGTGAGCC CTGAGCCAGT 3840 

TGCCGCAGTG ATGAAGGTAT CCTTAGTGAA ACTAAGGATA CTAGGCAAAA TTGGAGACTT 3900 

TTGGCTCCAA TTTTAGCAAT GAAACAACGA AGTTGGTTGC TTGCGTGCCA ATCACATAAG 3960 

GCAAACTGGA AAATAAAAAG ATACTTTTCG GAGAAAAAAC ATGACAAAAA ACTTATTAGT 4020 

AGAACTCGGT CTTGAAGAAT TACCAGCCTA TGTTGTTACG CCAAGTGAAA AACAACTAGG 4080 

CGAAAAAATG GCAGCCTTCC TCAAGGGAAA ACGCCTGTCT TTTGAAGCCA TTCAAACTTT 4140 

CTCAACACCA CGTCGTTTGG CTGTTCGTGT AACTGGTCTT GCAGACAAAC AGTCTGATTT 4200 

AACAGAAGAT TTCAAGGGTC CAGCAAAGAA AATTGCCTTA GATAGTGATG GAAACTTCAC 4260 

CAAAGCAGCT CAAGGATTTG TCCGTGGGAA AGGTTTGACT GTTGAAGATA TCGAATTCCG 4320 

TGAAATCAAG GGTGAAGAAT ATGTCTATGT CACTAAGGAA GAAATTGGTC AAGCAGTTGA 4380 

AGCCATTGTT CCAGGCATTG TGGATGTCTT GAAGTCACTG ACTTTCCCTG TCAGCATGCA 4440 

CTGGGCGGGA AATAGCTTTG AATACATCCG CCCTGTTCAC ACTTTAACTG TTCTCTTGGA 4500 

TGAGCAAGAG TTTGACTTGG ATTTCCTTGA TATCAAGGGA AGTCGTGTGA GTCGTGGCCA 4560 

TCGTTTTTTG GGACAAGAAA CCAAGATTCA GTCAGCATTG AGCTATGAAG AAGACCTTCG 4620 

TAAGCAGTTT GTAATCGCAG ATCCATGTGA ACGTGAGCAA ATGATTGTTG ACCAAATCAA 4680 

GGAAATTGAG GCAAAACATG GTGTACGTAT CGAAATTGAT GCGGATTTGC TGAATGAAGT 4740 

CTTGAATTTG GTTGAATACC CAACTGCCTT CATGGGAAGT TTTGATGCTA AATACCTTGA 4800 

AGTTCCAGAA GAAGTCTTGG TGACTTCTAT GAAGGAACAC CAGCGTTACT TTGTTGTTCG 4860 

TGATCAAGAT GGAAAACTCT TGCCAAACTT CATTTCTGTT CGTAACGGAA ACGCAGAGCG 4920 

TTTGAAAAAT GTCATCAAAG GAAATGAAAA AGTCTTGGTA GCCCGCTTGG AAGACGGAGA 4980 

ATTCTTCTGG CGTGAAGACC AAAAATTGGT GATTTCAGAT CTTGTTGAAA AATTAAACAA 5040 

TGTCACCTTC CATGAGAAGA TTGGTTCTCT TCGTGAACAC ATGATTCGTA CGGGTCAAAT 5100 

CACTGTACTT TTGGCAGAAA AAGCTAGTTT GTCAGTGGAT GAAACAGTTG ACCTTGCTCG 5160

TGCAGCAGCC ATTTACAAGT TTGACTTGTT GACAGGTATG GTTGGTGAAT TTGACGAACT 5220

CCAAGGAATT ATGGGTGAAA AATACACCCT TCTTGCTGGT GAAACTCCAG CGGTGGCAGC 5280 

TGCTATTCGT GAACACTACA TGCCTACATC AGCTGAAGGA GAACTTCCAG AGAGCAAGGT 5340 

CGGCGCAGTT CTAGCCATTG CAGACAAATT GGATACGATT TTGAGTTTCT TCTCAGTAGG 5400 

ATTGATTCCA TCAGGTTCTA ATGACCCTTA TGCCCTTCGT CGTGCAACTC AAGGTGTGGT 5460 

TCGTATCTTG GATGCCTTTG GTTGGCACAT TGCTATGGAT GAGCTGATTG ATAGCCTTTA 5520 

TGCATTGAAA TTTGACAGTT TGACTTATGA AAATAAAGCA GAGGTTATGG ACTTTATCAA 5580 

GGCTCGTGTT GATAAGATGA TGGGCTCTAC TCCAAAAGAT ATCAAGGAAG CAGTTCTTGC 5640 

AGGTTCAAAC TTTGTTGTGG CAGATATGTT GGAAGCAGCA AGTGCTCTCG TAGAAGTAAG 5700 

CAAGGAAGAA GATTTTAAAC CATCTGTTGA ATCACTTTCT CGTGCCTTTA ACCTGGCCGA 5760 

GAAGGCAGAA GGGGTTGCTA CGGTTGATTC AGCACTATTT GAGAATGACC AAGAAAAAGC 5820 

TTTGGCAGAA GCAGTAGAAA CACTCATTTT ATCAGGACCT GCAAGTCAGC AATTGAAACA 5880 

ACTTTTTGCG CTTAGCCCAG TCATTGATGC TTTCTTTGAA AATACTATGG TAATGGCTGA 5940 

AGATCAGGCT GTCCGTCAAA ATCGTTTGGC AATCTTGTCA CAACTAACCA AGAAAGCAGC 6000 

TAAGTTTGCT TGTTTTAACC AAATTAACAC TAAATAAAAT TTGATAAACG GACTTTATCT 6060 

TATTACAAAG GAGAAGAAAT GGATCCGAAA AAAATTGCTC GTATCAATGA GCTTGCTAAA 6120 

AAGAAAAAAA CAGAAGGCTT AACACCAGAA GAAAAAGTGG AACAAGCCAA ACTACGTGAG 6180 

GAGTACATCG AAGGTTATCG CCGCGCTGTT CGTCACCACA TTGAAGGAAT CAAAATTGTG 6240 

GACGAAGAAG GAAACGATGT TACACCAGAA AAACTACGCC AAGTACAACG TGAAAAAGGA 6300 

TTACATGGCC GTAGTCTTGA TGATCCAAAT TCATAATAAT ACTCTTCGAA AATCAAATTC 6360 

AAACCACGTC AGCTTCACCT TGCCGTACTT AAGTACAGCC TGCGGCTAGC TTCCTAGTTT 6420 

GCTCTTTGAT TTTCATTGAG TATATGTATT CTTTCTTTTA ACAAAGATAG ATGAAACGAT 6480 

AACAAAGAGA CTAGCAGTTT GTGTTTGCTA GTCTTTTTTC GCTAAAAAAG GAACCATAAT 6540

GGTTCCTAAA AACTATCATT AGTAACTTGC ACCGGCTGTA GCGTCTGCGT CACCACCGTG 6600 

GCCTCCAGCA TCCCCTGAAT CAGAAGCGCC AGAAGTAGCA TCGGCGTCTC CATGACCTCC 6660 

GGCAGCAGGA GCAAATGGTC CGCTACCACC CACCAAACGT TGACCAGTCT CTTTTAGGTA 6720 

CCAGTCAAGC CATGGTTGGA AGTTAAAGAC GATTTCATTG ATACCAGCGT ATGATCCATC 6780 

AGGATAGTAC ATTGCTTGGT AGTTGTGAGT GTTGATAACA CCTGCAGGAG AACCTGGAAC 6840 

GATCGTACGG ACGTATTCTT GGTTTCCGTT GCGAAGTGTT CCGATAACCC ACTCTACGTT 6900

CTTCATACGT GCTGGTGGAA GAGAACCATG AACAGTCGAC ATACGGCTAC CTGATTGAGG 6960

TGGTACACGT TTAGCGAACA TAGTGTCTGG ATCTTGGTGA GCGTTGTTGT AGTAGAGGAA 7020

TTGGTTGTTG TCGTCAGCGT ATGTCAATTC AAATGGCATA GCTTTCAAGA ACATATCAAT 7080

TTGGTTAACT GTTAGGATAC CGTGGTCCAA TTTGACATAG GTATCACCAG AAACAGCACC 7140 

AGTGAATGCT GCAACTTTTT CTACCCATTC TGGATCGTCA GGGTCAACTT CTGTGATGGT 7200 

TGTAGCGATT GGTTTTCCAC AATCCAAGTC TTCTGATTCG ATTGGTTTTG GTTTTTTCAA 7260 

TTTCGAAACG ACTCCTACGT ATTTAACAAA GTTATCTAAG CAAGTTTCAA GGAATTTAAC 7320 

AGTGCCTTCG TTGGTGATAT TTCCGTTGTT ATCAAAAGCT TCCTTAGCTT TACCAAGAAG 7380 

GAATTCGTTA CCTGGAAGCG TGTAGGCATT AACACCTGGA GCATCAAGGA TTTTACGAAG 7440 

GTGAACTTGA GCACGTGATG TTCCTTGGTC ATAGTATGAT GCACCCACAA TCATAACAGG 7500 

CTTGTTTTCA AATGGATGAA CTTCGTATGA AAGCCATTCA AGTACAGATT TGAGTGAAGC 7560 

TGAGATAGTG TGGTTATGCT CAGGAGTAGC AATGATAACA CCATCTGCAC GAGTAATTTT 7620 

GTTATATAAA TAACGTAATT GGAAACTTTC ATCCCATTTT TCATCTTGGT TAAACATTGG 7680 

AACTTCGTCA ATTTCAAGAA CTTCTAATTC AAATTTGAGT TTGAAGTAGC GACGGATAAA 7740 

TTCCAAGAGC TTACGGTTAT ATGATTGATC GTAGTTTGAT CCAACAAGTC CAACAAATTT 7800 

CATTCTTTTT GGTCTCCTAT CTTACAAATT TTCCCAGTCA AAGTCTTCAG CATCTTTGCG 7860 

AAGTAATTCT TGTGCATTAC GTAATTTTTC TGTGATTTTT ACAAAGATAC GGAAGTCATC 7920

AAAGATGGCA TCCAATTTCT TGATAACATC AAGGTCAACC AAGTCGCCAC TTGGGTTAAA 7980 

TGCTTGAAGA GAGTGTGAGA GCAAGAATTG ATCTGGAAGA ACATTTGCCT TGATTTCAGG 8040 

AGCATTCAAG ATTTGACGAA GTTGCAATTG GGCACGAGAT GAACCAAGCG TACCGTAAGA 8100 

AGCACCTGTA ATCATGATTG GTTTGTTCAA AAGTGGGTAA ATACCATAAG ACAACCAAGC 8160 

AAGAGCGCTC ATCAAAACAG CTGGAATAGA GTGATCATAC TCAGGAGTAC CGATAATAAG 8220 

GCCATCTGCC TCTTCGATTT TAGCAGCAAT TTCCAATATT TCAGCAGGTA CTTGCTTGTC 8280

AGCTGGTTTG TTGAAGACAG GAATGGCCTT GATTTCAACA AGTTCAATTT CAGCTTTGTC 8340 

AGTAAAGTGT TTTTGCATGT ATTGAAGCAA TTGACGGTTT GTAGAACGTT TTGAATTTGT 8400 

TCCAACAATA GCAATAAGTT TTAACATGAG ATTTCCTTTC TCTTTTTACA TAATACAATT 8460 

TTAAAATTCC ATTGAAACAG TTGTCTCTAT AGAGTAGGAA TTCCTGAAGA ACAGCTTAGG 8520 

TGGCCTTCTT TATCGATGAG GATGACTTCG ATGCCCTCCA AACTTTCGAC TTGCCAGAGG 8580 

ATAGAAGCAG GTCTTTCTCC AAAGAGTCGA GTCGTCCAGA TTTCGCCATC GACTGATTTA 8640 

TCAGAGATGA TTGTTAGACT CGCTAGTTCC GTTTCAACAG GATATCCTGT TTGACTGTCA 8700

AAAATGTGAT GGTAATCTTG TCCATCGACG GTCAGGTGAC GTTCATAAAT GCCTGAAGTC 8760

ACGACAGATT TATTGACAAC AGGGATGGTC ATTAAATGAT TTCCCCTAGG ATTGGCTGGG 8820

TCTTGAATCC CGATTTGCCA TGGGTTATCC CCTCTTGCCT GATTTTTTCC AATGGTCAGG 8880

ATATTCCCTC CCAGATTGAT CAAGGCAGAA GTCACCCCCT CTTTCCTAAG AAATTGGGCA 8940

ACCTTATCCG CACTGTATCC TTTGGCTAAA CAACCTAGAT CGATCTTCAT TCCTTTCTGT 9000

TTTAAAAACA CAGTAGAAGT AGAAGAATCT AACTCGATAC CATGAGGATT GATTAGAGGC 9060

AGCACCGATT CAATTTCTTG AGGCTGGGCG ACCTTGGCAT CTGAAAAACC GATACGCCAG 9120

GTTTGAATTA AGGGACCAAT GCTGATATTG AGGTGGCTAG AGAGCGCTAG GCTATGCTCT 9180

AACCCAAGTG AAATCAGCTC AAACAGGTCT GGATGAACCG TGACGGGGGC TATTCCTGCT 9240

TGATAATTGA TTTCCATCAA CTCAGATTCT TGACTATTGG CGTTGAAGCG GTATTCAAGT 9300

TCTTTGAGCA AGTCAAAGGA TTTTTGGAGA AAGATATCGG CTTGCTCATC CACTAATGAA 9360

ATAGTGATAG TAGTCCCCAT TAGCCGTTCA GAATGTGAAC GAAGAGTCAA GCTACCAACT 9420

CCTTTCTCTT ATAGAAAATA AGTTGTAATA TCAAATAATC ATCTAAATTG AAGCCCTTAC 9480

ATTTCATTTT CATGTTATTA TAATACCATA AAGTTAGAAT TTTCACAAAC AAAATTTGGA 9540

AAAAGTCAAG AAATATGCTC ATAAAATTCA TCAGGCTTGA AAACAGGATA AATGGGGAAT 9600

TATTTTTGAT AAAAAATGCT GAAATAATAG TACCCCCCTT GTAAACGCTA ACGGTAAATG 9660

GTATACTAGT AAGGTAAATT TAGAATGAAG GCAGGAAATT TTTATGAGTA AAATCGTTGT 9720

AGTCGGTGCT AACCACGCTG GTACAGCATG TATCAATACC ATGTTGGATA ATTTTGGAAA 9780

TGAGAACGAA ATTGTTGTAT TTGACCAAAA CTCTAACATC TCTTTCCTAG GATGTGGAAT 9840

GGCTCTTTGG ATTGGTGAAC AAATTGACGG TGCTGAAGGC TTGTTCTATT CTGATAAAGA 9900

AAAATTGGAA GCTAAAGGTG CTAAAGTTTA CATGAACTCA CCTGTTCTTT CAATCGACTA 9960

TGATAACAAA GTAGTTACAG CGGAAGTTGA AGGAAAAGAG CACAAAGAAT CATACGAA/A 10020

ATTGATTTTC GCTACAGGCT CTACACCAAT CTTGCCACCA ATCGAAGGTG TTGAAATTGT 10080

TAAAGGAAAC CGCGAATTTA AAGCAACTCT TGAAAACGTA CAATTCGTGA AATTGTACCA 10140

AAATGCTGAA GAAGTTATCA ATAAACTTTC TGACAAGAGC CAACACCTCG ACCGTATCGC 10200

CGTTGTTGGT GGTGGTTACA TCGGTGTTGA ACTTGCTGAA GCCTTTGAAC GTCTTGGAAA 10260

AGAAGTTGTC CTTGTTGATA TCGTTGATAC TGTCTTGAAC GGTTACTATG ACAAAGACTT 10320

CACACAAATG ATGGCGAAGA ACTTGGAAGA TCACAACATC CGCTTGGCTC TAGGTCAAAC 10380

TGTTAAAGCA ATCGAAGGTG ACGGTAAAGT TGAACGCTTG ATTACTGACA AAGAAAGCTT 10440

TGACGTGGAT ATGGTTATCC TTGCAGTTGG TTTCCGTCCA AACACAGCCC TTGCAGGTGG 10500

TAAGATCGAA CTCTTCCGCA ACGGTGCCTT CCTTGTAGAC AAGAAACAAG AAACATCTAT 10560

CCCAGACGTT TACGCTGTTG GTGACTGTGC GACTGTTTAT GACAATGCTC GTAAAGATAC 10620

AAGCTATATC GCTCTTGCTT CAAATGCTGT GCGCACTGGT AACGTTGGT 10669

(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7542 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 
CGCGCTAATA GATACTTTAT GATAGAATAA AGAACAAGAT TGACAAGTAA GAGGAAACAT 60

TATGCAAAAT CAAACACTCA TGCAATACTT TGAATGGTAT CTGCCCCACG ACGGTCAACA 120

CTGGACGCGT CTGGCTGAAA ATGCTCCACA CCTAGCTCAT CTGGGGATCA GTCACGTCTG 180

GATGCCACCA GCCTTCAAGG CAACCAACGA AAAAGATGTC GGCTATGGGG TCTATGACTT 240

ATTTGACTTA GGAGAGTTCA ACCAAAAAGG GACTGTCCGC ACCAAGTATG GTTTCAAAGA 300

AGACTATCTT CAAGCCATTC AAGCCCTTAA AGCACAGGGA ATTCAACCTA TGGCCGATGT 360

AGTTCTCAAC CACAAGGCTG CTGCCGATCA CAGGGAAGCC TTTCAGGTTA TCGAAGTTGA 420

TCCTGTAGAC CGTACAGTTG AACTTGGAGA ACCCTTCACC ATCAATGGCT GGACTAGTTT 480

TACCTTCGAT GGTCGCCAAG ATACCTATAA TGGCTTCCAC TGGCATTGGT ACCACTTCAC 540

CGGTACAGAC TACGATGCCA AACGCAGTAA ATCTGGGATT TATCTGATCC AAGGGGACAA 600

CAAGGGCTGG GCCAACGAGG AATTGGTCGA TAACGAAAAC GGAAACTACG ACTACCTCAT 660

GTATGCCGAC CTAGACTTTA AACATCCTGA AGTCATCCAA AACATCTATG ACTGGGCTGA 720

TTGGTTCATG GAAACGACTG GTGTAGCTGG TTTCCGTTTG GATGCCGTTA AGCATATTGA 780

CTCTTTCTTT ATGCGCAACT TCATCCGCGA TATGAAGGAA AAATACGGTG ACGATTTCTA 840

TGTTTTTGGT GAATTTTGGA ACCCAGACAA GGAAGCCAAT CTGGACTATC TCGAAAAAAC 900

GGAAGAACAC TTTGACCTTG TCGATGTTCG TCTCCACCAG AATCTCTTTG AAGCCAGTCA 960

AGCTGGCGCA AACTATGACC TTCGTGGCAT TTTCACAGAT AGCCTGGTTG AACTCAAGCC 1020

TGACAAGGCT GTGACTTTTG TCGACAACCA CGATACCCAA CGAGGACAAG CCCTTGAGTC 1080

TACCGTTGAA GAATGGTTCA AGCCAGCAGC CTATGCCCTC ATTTTGTTAC GCCAAGACGG 1140

CCTTCCATGT GTCTTTTACG GAGACTACTA TGGGATTTCA GGGCAGTATG CTCAAGAAGA 1200

TTTCAAAGAA ATCCTTGACC GCCTCCTAGC CATCCGAAAA GATTTGGCCT ATGGAGAACA 1260

AAATGACTAC TTTGACCATQ CTAACTGTAT CGGTTGGGTA CGTTCAGGTG CTGAAAATCA 1320

ATCCCCAATC GCAGTCCTTA TCTCAAATGA CCAAGAAAAC AGCAAGTCAA TGTTTGTCGG 1380

TCAAGAATGG ACTAATCAAA CCTTTGTAGA TTTACTTGGT AACCACCAAG GTCAAGTTAC 1440

AATTGATGAG GAAGGTTATG GACAATTCCC TGTCTCAGCT AGATCCGTAA GTGTCTGGGC 1500

AGTCAATACC ATCTAATAGC TCATAATAAC CAAGCTAGGT CCAAGCGGAT TTGGCTTTTT 1560

TGTATTCACA AAAAGACCTA CCCAAATGGA TAGATCTTTA CTTGATTACA ATTTACCTGC 1620

TACTGCATCC AACAATTCTT GGATCTTAGG TTGGTTGCTT CCTCCTGCCA TGGCCATATC 1680

TGGTTTACCA CCACCACGTC CATCGATGAT TGGTGCTAAT TCTTTGACAA GGTTTCCTGC 1740

ATGAAGGTCT TTTGTCTTGC TTGCTACAAG GACATTGACT TTGTCACCGA TAGCGGCAAC 1800

TAGGACAAGA AGATCAGAGT AGTCTTTTTG TTTCCAGTTA TCTGCAAAAG TACGAAGGGC 1860

ACCGGCATCG GATACAGACA CTTGACTAGC AATGTAACGA TGACCGTTGA CTTCCTTAAC 1920

ATCTTTGAAG ATATCGCCTG CGGCTGCAGC TGCGGCTTTT TCTTTCAACT CAGCATTTTC 1980

TTTTTGAAGT TGACGAAGTT GTTCTTGAAG TCCTTCTACC TTGTGAGGTA CTTCCTTGAC 2040

TTGAGGTGCT TTCAAGGTTG CTGCGATAGC TTTAAGAGCA TCCTCTTGTT CACGATAGGC 2100

TTCAAAGGCT TCCTTACCAG TCACTGCCAA GATACGGCGA GTTCCTGAAC CGATTCCTTC 2160

TTCTTTGACA ATTTTGAAGA GACCAATCTC AGAAGTGTTG TCAACATGAG TACCACCACA 2220

AAGTTCAATA GAGTAGTCAC CGATAGTCAC GACACGAACT TCCTTGCCGT ATTTCTCACC 2280

AAAGAGGGCC ATAGCTCCCA TTTCTTTAGC AGTGTCAATA TCCGTTTCAA CTGTCTTCAC 2340

TTCAAGTGCT TCCCAAATTT TCTCGTTAAC TTGCTGTTCA ATCGCACGAA GTTCCTCAGC 2400

AGTTACTGCT TGGAAGTGGG TAAAGTCAAA GCGAAGGAAT TCAACTTCGT TAAGAGATCC 2460

TGCCTGTGTT GCGTGGTTTC CAAGGATATT GTGAAGGGCA GCGTGAAGCA AATGAGTCGC 2520

AGTGTGGTTT TTCATGACAC GGTGACGGCG ATTGCTATCA ATTGCCAAGG TATATTCTTG 2580

GTTCAAGGCA AGCGGTGCAA GGACTTCAAC TGTATGAAGG GCTTGACCAT TTGGGGCTTT 2640

CTGAACATTG GTCACAGTAG CCACAACCTT ACCTGACTCA TCCAAGATTT GTCCGTAGTC 2700

AGCTACCTGT CCACCCATTT CAGCATAAAA TGACGTTTCC GCAAAGATAA GAGAGGCAGT 2760

TCCTTCTGAA ACAGCTCCTA CTTCTGCATT GTCAGCAACG ATAGCTACCA ATTTAGAAGA 2820

CAATTGGCTA GCATTGTAGT TGAAGACACT TTCTACAGTG ATGTTTTGAA GAGTTTGATT 2880

TTGCATACCC ATTGAGCCAC CCTTGACAGC TGACGCACGC GCGCGTTCTT GCTGTTCTTT 2940

CATGGCTGCT TCAAAACCTT CACGGTCTAC AGTCATACCA GCTTCTTCAG CGATTTCTTC 3000

AGTCAATTCA ACTGGGAACC CATAAGTATC ATAGAGTTTG AAGACATCTG AACCAGCGAT 3060 

AACAGATTGA CCTTTTTCTT TCAAGTCTGC TACAATGCCT TGGGCAAAGT GTTGACCTGA 3120 

GTGAAGGGTA CGGGCAAATG ATTCTTCTTC GCTCTTAACG ATTTTCTCAA TAAAGTCACG 3180 

TTTCTCAAGC ACTTCTGGGT AGTAGCTTTC CATGATTTTT CCAACAGTTG GAACCAATTT 3240 

GTAAAGGAAA GGCTCGTTGA TACCCAATTT TTGACCATGC ATAGAAGCAC GACGGAGAAG 3300 

ACGACGAAGA ACATAACCAC GACCTTCATT TCCTGGAAGG GCACCATCAC CGATAGCAAA 3360 

TGAAAGAGAA CGAATGTGGT CTGCGATAAC CTTGAAGCTC ATGTTGTCGC CATCTTGGTC 3420 

ATAAACCTTA CCAGACAATT TCTCGACTTC ACGGATAATC GGCATGAAGA GGTCCGTTTC 3480 

AAAGTTGGTC TTAGCCCCTT GGATAACGGC CACCAAACGC TCCAAACCAG CGCCCGTATC 3540 

AATGTTCTTA TGTGGCAATT CCTTGTATTC GCTACGAGGA ACAGCAGGGT CTGCGTTAAA 3600 

TTGTGACAAA ACGATGTTCC AGATTTCAAT ATAACGGTCG TTTTCAATAT CTTCTGCAAG 3660 

CAGGCGAAGA CCGATATTTT CTGGGTCAAA GGCTTCCCCA CGGTCAAAGA AGATTTCTGT 3720 

ATCTGGTCCA GAAGGTCCCG CACCGATTTC CCAGAAGTTG TCCTCAATTG GAATCAAGTG 3780 

ACTTGGATCC ACTCCCACTT CAATCCAGCG GTTGTAAGAA TCTTTATCGT CTGGATAGTA 3840 

GGTCATGTAA AGTTTTTCAG CAGGGAAATC AAACCATTCA GGGCTTGTCA AAAGCTCATA 3900 

AGCCCAAGTG ATAGCTTCGT CACGGAAGTA ATCCCCGATA GAGAAGTTCC CCAGCATTTC 3960 

AAACATGGTA TGGTGACGCG CGGTCTTCCC TACGTTTTCG ATGTCGTTGG TACGGATAGC 4020 

CTTTTGGGCA TTGGTAATAC GTGGATTTTC AGGGATAATG GTCCCGTCAA AGTATTTCTT 4080 

AAGGGTTGCT ACCCCAGAGT TGATCCACAA AAGAGTTGGG TCATTTACAG GAACCAAACT 4140 

TACTGATGGT TCTACTGAGT GACCTTTGGT CGCCCAGAAA TCAAGCCACA TTTGGCGTAC 4200 

TTGTGCACTA GATAGTTGTT TCATATTGTC TCCTTATTCA CTTGTTTAAT GTGATTGGCT 4260 

TTCCAGCATT TCCACATAGT CAATCGCGAC ACAGAGGGAA ATGACTAGGT CTGCATAAGC 4320 

GTCTTCAAGA ACCGTTACGG TATAGGTAGA AGTCAGATGG AAGAGTTCCT TCTTAATTTC 4380 

CGCAATCAAC TGATCGCGAT CATCCAGCAA TTTGAAATTC AAATCCCAGA TATTGCCCTC 4440 

GATACGAAGA CCTAGATTAT CAAACTCATA CTTATCTCGC CAGAAGGTCA ACTTCTTACG 4500 

AATGACAAAA CTCGAGCCAT CCCGAAGCTG AATTTCAAAA CGAGGAAGCA AGGTCAAGAT 4560 

TTCTTTACTA ATCTCACTGA CTTGTTCACC AGCCGCATCA TAGATGGTAA AGGTTTTAGG 4620 

AATCTTAAAA AATGATCCCT CCACCTGATA GGCAATTTCT CCCCTGTCAT CCTTGATAGC 4680 

GAAGCGTTCG CCTCCAAGAC GAAACTTTTG TTTGACAAGA AATGTTTTCA TCAACACCTC 4740

CAAAAATCAA AAGACAAGCT CATATCACGA AGGGCGAAAA ACCGCGGTAC CACCTTCATT 4800

CAATGAACTT GTCATTCTCT TGTTCTTATG CAATTGTATG ATTGAGTAGC ATGACTTCCT 4860

AGCTTAGATG GCTCGCAGCA CCGCCATTTC TCTGGACTAA GACAAGTGAA AATCAATTCT 4920

CAACTTTCTT ATTATAACGT TTTTTTAAGC TTGCGTCAAC TGGAAATGAT CTCCGTTGAA 4980

TTAGACCAAT TCCCTACATC TCTGATTACT TTTTCAGGAT ATATTTTTTC TTACTGCCAT 5040

TTTTCTTTTT ATCCCAAATT TTCATATTAC TAAACACAGC TACTAGAATA TTTCCAAATA 5100

TAAAGGTGCC TATCACCCAA TATATGGACT CAGTTGTTAG GTATTGTCGA TCCAAGCCAT 5160

CCTTTAAATG GAATAGTATA GCAGTTTGGT TAACAATCAT AAAGGTTGGC CAGAAACTTT 5220

TTTTGAAAAA AGTAGACATT TTCATTATTT GTTGCCGCTT TCTGTAAGGT TAATACTCAA 5280

TAAAAATCAA AAAGCAAACT AGGAAGCTAG CCTCAAGCTG TACTTGAGTA CGGCAAGGCA 5340

ACGCTGACGT GGTTTGAAGA GTATAGGCTT AGTATACTAC TAGGCAAGCA AATAAACAAA 5400

TAAACAACTA GAATAGAAAA AGATAGGGCT CTAAAAACTG ACTTCTATTC CTTAAAAACG 5460

AACCAGCTTG ACTGATTCGT CTTCTTACGT TTATCTCCTA CTTCCGATAC ATTTTAAACT 5520

GTAGGAAGAG GTCGCTATAT TTCCCTGTCC ATTTATGGTC AAATTTCTCA TAAACTTGTA 5580

GGTGTTTCAT GGTTTCAACA TCGGGATAGA AGGCCTTATC TTCCTTTGTT TCCTCTGGGA 5640

GCAATTCCTT CGCTGGTAGG TTTGGTGTTG AATAGCCGAC ATACTCCGCA TTTTGGAGAG 5700

CATTTTCAGG TTTCAACATA AAGTTGATAA AGGCATAGGC TGAGTTTTGG TTTTTAACTG 5760

TTTTGGGAAT GACCATATTG TCAAACCAAA GATTGCTGGC CTCTGTCGGT ACCACATAAC 5820

GTAGATTTTC ATTTTTTTCT AACATTTGGC TGGCTTCACC AGAGAAGGTC ACGCCGATTG 5880

CAACATTATT CTGAATCATA TAGCCCTTCA TCTCGTCCGC AACGATAGCC TTGATATTTG 5940

GAGTCAGTTT GTAGAGCTTA TCCACTGTCT CTTCCAACTG CTGCAGATCC TTGGAGTTGA 6000

GGCTGTAGCC GAGGGAATTG AGTCCTAGTC CCAGCACCTC ACGCGCCCCA TCAAAGAGCA 6060

TGATAGAATT CTTATACTCC GGCTTCCAAA GGTCATCCCA ATGCTCAGGC GCTTCATCTA 6120

CCATGGTTTC GTTGTAGACA ATTCCTAAGG TTCCCCAGAA GTAAGGGATG GAGAATTTAT 6180

TACCTGGGTC AAAGGACTGG TTGAGAAACT CTGGTCCGAT ATTTTCGATT CCTTCAATTT 6240

TTGAATAATC AAGCGGAACC AAGAGGTCTT CGTCCTTCAT CTTGTTAATC ATGTATTCAC 6300

TTGGAATGGC AATATCGTAG GTCGTTCCAC CCTGCTTTAT CTTAGTGTAC ATGGCTTCGT 6360

TGGAGTCAAA AGTCTCGTAC TGAAGTTGAA TTCCTGTTTC TTCTGTAAAC TGAGTCAAGA 6420

GTTCAGGATC GATATAGTCT CCCCAGTTAT AGATAACCAA TTTTTGACTA TCTCGAGTAT 6480

TGATTTTACT ATCTAAATGA GTCGCAATTC CCCACAAGAC AAGGATAATC GCTGCAATTC 6540

CTGCTAAAAA TGAATAGATT TTTTTCATGC TTGCTCCTCC TTCTCACGAG AGATAAAGTA 6600 

ATAACCTACA ACTAGGATAA TACTAAAGAG AAAGACTAGA GCAGACAGGG CATTGATTTC 6660 

TAAGGAAATC CCCTTGCGAG CACGAGAGTA AATCTCGACT GATAGGGTTG AAAAGCCATT 6720 

TCCTGTTACA AAGAAGGTCA CGGCAAAGTC ATCTAACGAA TAGGTGAAGG CCATGAAATA 6780 

ACCAGTAATG ATAGACGGAG TCAGGTAAGG AAGCATGATT TCCTTGAACA TCTGAAATTG 6840 

ACTAGCTCCC AAGTCATAGG CCGCATGAAT CATGTCGCCA TTCATTTCCT TGAGTCGAGG 6900 

CAAGACCATC AAGACCACGA TAGGAATGGA GAAGGCCACG TGACTAGATA GAACGGTCAA 6960 

AAAGCCAAGT GAAAACTTGA GTTGGGTAAA GAGAATCAAG AAGCTAGCAC CAATCATAAC 7020 

GTCAGGCGCA ACCATGAGGA TATTATTGAG TGATAGAAAG GCTTCTTGGT ATTTCTTACG 7080 

AGACTGGTAG ATGTAAATGG CACCAAAAGT CCCGATAATG GTCGCTATCA AGGCTGATAG 7140 

GAAGGCCAAG AAAAATGTCT GAGCCAAAAT CAGCATGAGT CTCCCATCTC CAAACATGGT 7200 

TTCAAAGTGA GTCCAGCTAA AACCTGTAAA GCTATTCATA TCATCACCAG CATTAAAGGC 7260 

ATAGCCAATC AAGTAAAAGA TAGGCAGGTA GAGGACCAGA AAGACCAGTC CCAGATAAAG 7320 

GTTGGCAAAT TTTTTCATCG TTCTCTCCTT TCCTTAGTCA CCCACATGGT GATGAACATG 7380 

GTCAGGATGA GAATCACACC GATGGTTGAA CCCATACCAT AGTTGTCATT GGTTAGAAAA 7440 

TTCTGCTCAA TAGCCGTCCC CAAGGTGATA ACGCGTTCCC ACCAATCAAA CGGGTCAGCA 7500 

TGAAGAGACT CAAACTTGGG ATAAAGACCG ACTGAACCCC GG 7542 

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9223 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

AAAACCAAAT TCCGGTATTT TAACCTATGC TGTAAATACC ATGAAGTCTG TCATGACAGA 60 

TCAGGTCTAT AACATTAAGG TTGAGACAGA AAATGGAAAT TATGTTGGTG AAGCTAGCCA 120 

TGTTTTGGTC CTTTTGACAA ATTACTTCGC TGATAAGAAA ATCTTTGAAG AAAACAAGGA 180 

CGGCTATGCC AACATTTTGA TTCTGAAAGA TGCCTCTATA TTCTCCAAAT TATCCGTCAT 240 

TCCTGATTTA TTAAAAGGGG ATGTTGTCGC AAATGATAAT ATCGAGTATA TCAAAGCGCG 300 

TAATATTAAA ATCTCTTCAG ATAGTGAATT GGAGTCAGAT GTTGACGGAG ATAAATCAGA 360 
TAACCTACCT GTAGAAATCA AAGTCCTAGC TCAGCGAGTA GAAGTATTTT CAAAACCGAA 420

AGAGGATTAG TATATAGAGA AAGCCTTTTT TAAGGCTTTT TGTATACTTT AAAAGATAGT 480

TCCTTTAACA ACGGACATTC CTTGCAAATA GTTTTACAAA AATAGTATAC TGGATTCATT 540

GAGTTTGAAA ACGTTTGCGT AAAATTTGAA TGAATACTTT AGGAGACAAA TTGATGGAAT 600

TGAGTGCTAT TTACCATAGG CCTGAGTCGG AGTATGACTA TCTTTATAAG GATAAGAAAC 660

TCCATATTCG AATTCGAACT AAGAAAGGGG ACATTGAAAG CATCAACTTG CACTATGGGG 720

ACCCTTTTAT CTTTATGGAG GAGTTTTATC AGGATACAAA AGAAATGGTC AAGATAACTT 780

CTGGTACCTT ATTTGACCAT TGGCAGGTTG AAGTGTCAGT TGACTTTGCA CGTATCCAGT 840

ATCTCTTTGA GCTCAGAGAT ACAGAAGGTC AAAATATTTT GTATGGCGAT AAAGGGTGTG 900

TGGAAAATTC TCTAGAAAAT CTTCATGCAA TTGGGAATGG ATTTAAGTTG CCTTAGCTTC 960

ATGAGATTGA TGCCTGCAAG GTTCCTGACT GGGTTTCAAA TACGGTATGG TATCAGATAT 1020

TTCCTGAAAG ATTTGCCAAT GGCAATGCTC TATTAAACCC AGAAGGGACT TTAGACTGGG 1080

ATTCATCTGT CACACCTAAG AGCGATGATT TCTTTGGTGG TGATTTACAG GGGATTATTG 1140

ATCATATGAA TTACTTGCAA GACTTGGGTA TTACTGGACT ATATCTTTGT CCCATCTTTG 1200

AATCTACAAG CAATCACAAG TACAATACGA CAGATTACTT TGAAATTGAC CGTCATTTTG 1260

GAGACAAGGA GACCTTTCGG GAACTGGTGG ATCAAGGGCA TCATCGTGGC ATGAAAGTCA 1320

TGCTGGATGC GGTATTTAAT CATATTGGTT CGCAATCTCT TCAATGGAAA AATGTCGTCA 1380

AAAATGGTGA ACAGTCTGCT TATAAGGATT GGTTCCATAT TCAACAATTC CCAGTGACAA 1440

CTGAAAAGCT AGTTAATAAG AGAGACTTAC CCTATCATCT TTTTGGTTTG GAGGACTATA 1500

TGCCTAAGCT AAATACAGCC AATCCAGAGG TCAAGAATTA TCTTTTAAAG GTTGCGACTT 1560

ATTGGATTGA AGAGTTTAAT ATCGATGCTT GGCGTTTGGA TGTGGCTAAT GAGATTGACC 1620

ATCAGTTCTG GAAGGATTTT CGTAAGGCAG TTTTAGCTAA AAATCCTGAT CTTTATATCC 1680

TAGGAGAAGT CTGGCATACA TCTCAGCCTT GGCTAAATGG AGATGAGTTC CATGCCGTCA 1740

TGAATTATCC TTTATCTGAT AGTATCAAGG ACTATTTCTT ACGAGGAATT AAGAAGACAG 1800

ACCAGTTCAT CGATGAAATC AATGGAGAGT CTATGTATTA CAAGCAGCAG ATTTCAGAGG 1860

TCATGTTTAA TCTCTTGGAT TCACATGATA CAGAGCGAAT CCTGTGGACG GCCAATGAAG 1920

ATGTTCAACT GGTTAAATCA GCCTTAGCCT TTCTCTTTTT ACAAAAAGGA ACACGGTGCA 1980

TTTATTACGG AACCGAGCTA GCCTTGACTG GAGGACCAGA TCCAGATTGT CGTCGTTGTA 2040

TGCCTTGGGA ACGTGTATCA AGTGACAATG ATATGCTGAA CTTTATGAAG AGGCTGATTA 2100

AAATTCGGAA ATACGCGTCA GTAATCATTT CGCATGGCAA GTATAGCCTT CAAGAAATCA 2160

ACTCTGATCT AGTAGCTCTG GAATGGAAAT ACGAAGGACG GATCCTCAAA GCAATATTCA 2220 

ACCAATCAAC AGAAGATTAT CTTTTAGAGA AAGAAGCAGT AGCACTAGCA AGCAATTGCC 2280 

AAGAATTGGA TAATCAGCTT GTCATCTCTC CAGATGGATT TATGATTTTC TAAAAACTAG 2340 

TTGATGAAGA TTATGGTACA TTTCATACCT TATATAGTAT AATAAGGCTA GTTACTAAAC 2400 

TTGTAAAGGA GAACTTAAAT GAATTGTAGA GGACATGAAA CAAGACAAAG AATTGTTAGA 2460 

GATTTTGAAG TTCAGCCTAA AGCACATATT AAGCTGTTAG CAAATCAACA AAAACATAGT 2520 

GATGCAGGAG CAACTATTGA AGATGAATAT TATGTATTTA TCGCTGAGAG TAAAATTGAT 2580 

GGCAAGAAGG AAGTTATTCA GTGTTGCATG GGTGCGGCAA GGGATTTTTT AGAACTAATT 2640 

AATCACAAAG GGCTACCTCT TTTTAATCCG CTTGTAGGTG ATTCTCATGT AAATAATAGA 2700 

CAAGAATATG ACAATACAGG GAGTGGAAAT TTATAACCTG AAAAGTGGAA TGAAACTGCA 2760 

AAGCAGCTTT ATAATGCTAT AATGTGGTTG ATTATTTTAT GGAATGCTAA GCCGGATACA 2820 

CCTTTATTTA ATTTTAAAGA CGAAGTAATT AAGTATAAAA CATATGAGCC TTTTGAAAGC 2880 

AGTATAAAAA GAGTAAATAC TACTATAAAG AATGGTAGTA AAGGGAAAAC TCTGACTGAG 2940 

ATGATTAATG GCTACAGAGC GGATAACGAT ATTAGAGATG AAATTTGTAA CTTTAATATT 3000 

CTGAAAAATA AAATTCGTGA TATGAAAAAC CAACAAGGAA ATACAATGGA ATCTTACTTT 3060 

TAGTTATTGT TGAATTTTGG GTATTCTATA AAATATCCTA ATTGAGATTT AAATAGTAGA 3120 

CTATACAATA TAGTTAAAAT ATCAGTAAAA ACAACACTTT ATTGAGGTAT TGGATACGCT 3180 

TTGCTAATAG CCTAATAATC ACATGTGGAG TGTTGCTACA ACGAAAAAGG TGATAATCCT 3240 

TGATTTCAAG CTATTTTATA AGCATTTTGT CTTTGTAGAT AAAGGCAATT TTGACAATAA 3300 

AAATCCTAAA AGGTGAATCG TTATAGATGT ATTTGTAGAT ATCGTTTGCG CATCGAAAAA 3360 

ATTAATACAA GAATAAATAT TTATAGCTCT TTAGGTGACT TTTATAGAAG TAAAGTTTAG 3420 

GATAGAAAAA CAAGAAATAA CGCACCATTT TTGGTGCGTT ATGCTTTTTT ATGCTATAAT 3480 

GGATTTATAA AAATAAAGGA GTTTGCTATG ATTGGAAAGA ACATAAAATC CTTGCGTAAA 3540 

ACACATGACT TAACACAACT CGAATTTGCA CGGATTGTAG GTATTTCACG AAATAGTCTG 3600 

AGTCGTTATG AAAATGGAAC GAGTTCAGTC TCTACCGAAT TAATAGAGAT CATTTGTCAG 3660

AAGTTTAATG TATCTTATGT CGATATTGTA GGAGAAGATA AAATGCTCAA TCCTGTTGAA 3720 

GATTATGAAT TGACTTTAAA AATTGAAATT GTGAAAGAAA GAGGTGCTAA TCTATTATCT 3780 

CGACTCTATC GTTATCAAGA TAGTCAGGGA ATTAGCATTG ATGATGAGTC TAATCCTTGG 3840 

ATTTTAATGA GTGATGATCT ATCTGATTTG ATTCATACGA ATATCTATCT AGTAGAAACT 3900 
TTTGATGAAA TAGAGAGATA TAGTGGCTAT TTGGATGGAA TTGAACGTAT GTTAGAGATA 3960

TCTGAAAAAC GGATGGTGGC CTAATGGAAA TCCAAGATTA TACTGATAGT GAATTCAAAC 4020

ATGCTTTAGC AAGGAATCTT CGTTCACTGA CAAGAGGAAA AAAGTCCAGT AAGCAACCTA 4080

TAGCGATTTT GCTTGGAGGG CAAAGTGGTG CCGGTAAGAC TACAATTCAT GGTATTAAAC 4140

AGAAAGAATT TCAAGGAAAT ATTGTTATCA TAGATGGTGA TAGTTTTCGT TCTCAGCATC 4200

CACACTATTT AGAACTGCAG CAAGAATATG GCAAAGACAG TGTAGAATAT ACCAAAGATT 4260

TTGCAGGAAA AATGGTAGAG TCTTTAGTAA CAAAATTGAG TAGTTTGAGA TACAATCTTT 4320

TGATAGAGGG AACTTTACGA ACAGTTGATG TTCCAAAGAA AACAGCACAA CTCTTGAAAA 4380

ATAAGGGATA TGAAGTACAA TTGGCCTTAA TTGCGACAAA GCCTGAATTG TCGTATCTAA 4440

GTACTCTTAT CCGTTATGAA GAACTGTACA TTATCAATCC AAATCAAGCA CGCGCAACTC 4500

CAAAAGAACA TCATGATTTC ATTGTAAATC ATCTAGTTGA TAACACACGA AAATTGGAAG 4560

AACTAGCTAT CTTTGAAAGA ATTCAAATTT ACCAACGAGA TAGAAGTTGT GTATATGATT 4620

CAAAAGAAAA TACAACTTCA GCAGCAGATG TTCTTCAAGA GTTACTCTTT GGGGAGTGGA 4680

GTCAGGTAGA GAAGGAGATG TTGCAGGTGG GGGAAAAGAG ACTTAATGAA TTACTTGAAA 4740

AATAAACAAT TGATATTTTT AGGAGAATAG AAATGAGAGG GTTTAATAAC AAGATAAAGT 4800

CTGTTTATCA AGAACTAACA AATTCCAAAG AGAAATTCGG TAGCTTTCAC AAGACTTTAA 4860

TTCATTTGCA TACACCTGTT TCTTATGATT ACAAGCTATT TTCTAATTGG ACTGCAACGA 4920

AATATAGAAA AATTACTGAA GATGAACTAT ATGATATATT TTTTGAAAAT AAGAAAATAA 4980

AAGTTGATAA GACAATTTTT TTTAGTAATT TTGATAAGGT TGTTTTTTCT AGTTCAAAAG 5040

AATATATTAG TTTTCTTATG TTAGCAGAGG CAATCATAAA AAATGGAATA GAAATAGTTG 5100

TAGTAACTGA TCATAATACT ACCAAAGGTA TTAAAAAGTT ACAAATGGCA GTCTCAATCA 5160

TAATGAAAAA TTATCCGATT TATGATATAC ATCCTCATAT TTTACATGGA GTAGAAATTA 5220

GTGCAGCAGA TAAATTGCAT ATTGTATGTA TATATGATTA TGAACAAGAA TCATGGGTTA 5280

ATCAATGGTT AAGTGAAAAT ATTATAAGTG AGAAAGATGG AAGTTATCAA CATTCACTGA 5340

CTATAATGAA GGATTTCAAT AATCAAAAAA TAGTTAACTA TATTGCTCAT TTCAATAGTT 5400

ATGACATTTT GAAAAAAGGT TCTCACTTAT CAGGTGCATA TAAACGAAAA ATTTTTTCTA 5460

AAGAAAATAC ACGATTTTGG AGTTTAATAT TAACTCGAAA GAATCTTCGC AACAACTTGA 5520

TATTCTCTAT AAAGAAGTTG GTGTATTAAG TTTGGGACAA AAAGTTGTAG CCATGCTTGA 5580

TTTTTTATTA GCATATAGTG ATTATTCTAA AGACTTCAGA CCATTGATTA TTGATCAGCC 5640

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TGAAGACAAT CTAGACAATC GTTATATTTA CAGGCATTTA GTTCAGCAGT TTAGAGATGT 5700 

GAAAGCTCAA CGTCAAATTA TTTTAGCAAC ACATAATGCT ACAATTGTAA CAAATTCTAT 5760

GACAGATCAA GTTGTTATTA TGGAGTCAGA TGGAGTTAAC GGATGGATTG AATCACAGGG 5820 

ATATGTTAGT GAAAAATATA TAAAAAATCA TATCATCAAT CAATTAGAGG GAGGAAAAGA 5880 

TTCCTTCAAG CATAAAATGT CTATATATGA GACGGCTTTA TCAGAGTAGA GTCAGAAAAA 5940 

GTAGGTTAGA AATTTAGCCT ACTTTTTTCT TTGTCCGACA GGCATAGTGT ACATCTGAGG 6000 

TCCAAGTCCT CTGTGGATAT TTGCTGCAGA TGAAACCAAT AGCGACTCCT AAGCCTGAAT 6060 

ATCGTGAGGT AGGGGGGATA GGAAGGAATT AGCGAAATCA AGGTTCTACA AACAGAATCG 6120 

TGACTTGAAG CCATATATAG CGGATGAGGA ACTCTAAAAT CCAAATAGGT GTCGTAACCT 6180

ATATACGTAA ATTACGAGAG TAAACTAGGA AAGATGTACG GCTTATTCCG TGAGCGTTTA 6240 

GGACGTAGTA CAACGAATCA TGGGAGTCAG CTGAACACAT AGTATTGAAG AAATTTCTGT 6300 

AATGGAAATG GAGCGAAGAA GTGAACAATT AAATGAATAC CTCTCTAATT AAATTTGTCA 6360 

ATTCTAATTC CTGGTATGAA AAGACAGTGA CCTGAAAATG TAAACGATGG GAGCTGATCA 6420 

TAAATATAGG ACGGTACATG CAGTGGTGTT AGAGATTAGT CCTTACTTGA TTTGTGATAA 6480 

CTTCCCCAAA TTTCTTCTGC TATACTTTTC TCAACTTTTA AAAATCCAAC TAAGAATTTT 6540 

ACCTGGGGGT TTGGGGGCGG AGCACTAAGT TATCTTATCG TTAGCTGTCA AAACTGGTAG 6600 

GTTTTGATAG GCTGGCGATA TGATTTTTGG GATATTGTGG ACACAATATC TGAGCTCGCA 6660 

AAGCCTTACA AGAATGAAAA TCAGTTGTTG GAAAAGTGTA CTGACATTGT ATGGTAGCTC 6720 

ACATTGTCAG TACAAGTATT TTGGAAAGGA AGTAGCAGTA TGAAACGAGA TGTGCGTGAT 6780 

ATTCGGAAAC AATTTCGTTT AACAGAAGCA GAAGAAAAGC AAATTCTAGC TTTGATGAGA 6840 

GAGCGGGGAG AGACTAATTT CTCTGATTTT CTTCGTAAAA GTTTACTTTC CTCTGATTTA 6900 

CAAAAACAGA TGGAGACATG GTTTGCCCTC TGGCAATCCC AAAAACTAGA ACAAATCAGT 6960 

CGTGACGTTC ATGAAGTTTT AATCTTGGCA CAGTCAGAAC GTCAAGTCAC CCAAGAGCAT 7020 

GTATCTATTC TCTTAACGTG CGTGCAGGAA TTGATTCAAG AGGTTGCAAA CACCATACCC 7080 

CTCAGTAAAG AATTTCGTGA GAAGTACATG AGGTAAGCAC ATGGAACATC GTTACCGAAC 7140 

CAATCTCAAG AAAGTGTTTT TGTCTGATAG TGAGTTGAAC CAACTAAATA TAAATATCGA 7200 

TCAAAGTGGT TGTAAATCCT TTTCTGAATA TGCGAGACGA ACTCTACTCG ATCCTGGTAT 7260 

GAATTTTATC ACGATTGACA CAAACGGTTA CCAAGATTTA GTGTTTGAGT TAAAGAGGAT 7320 

TGGCAATAAT ATCAACCAGA TTGCTCGAAG TGTTAATCAA TCTCAGTTAA TTTCTGGTGA 7380 

AGAATTGCAG GAGTTGAAAA AAGGAATTGG TGAATTGATA AAAGAAGTTG ATAAGGAATT 7440 
TAATCTGCAA GCGCAGAAGC TAAAGGAGTT CCATGGTCAT CACTAAACAC TTTGCCATTC 7500

ACGGAAAGAG TTACCGCAGA AAGCTTATCA AGTACATTCT CAATCCTGAG AAAACCAATA 7560

ATCTTGCCTT GGTGTCGGAC TATGGCATGA AGAATTTTCT GGACTTTCCT AGCTATGAGG 7620

AAATGGTGCA GATGTATCAT GAAAATTTCA TCAGCAACGA TACGCTTTAC GATTTTCGGG 7680

ACGACAGGAT GGAAGAAAAT CAACGAAAAA TACACGCTCA GCACATCATT CAGTCTTTCT 7740

CGCCAGAGGA TCATATCACT CCTGAACAAA TCAATCGGAT AGGTTATGAG ACTGTGAAGG 7800

AATTAACTGG TGGCAAATTT CGTTTTATCG TTGCGACCCA TGTTGATAAA GACCACCTGC 7860

ACAATCACAT CATTATCAAT TCAGTAGATA GCAATTCTGA CAAAAAGGTC AAGTGGGACT 7920

ACAAGGTGGA GCGAAATCTT CGCATGATTT CTGACCGTTT TTCTAAAATC GCAGGTGCTA 7980

AAATCATTGA GAACCGCTAT TCTCACCAGC GGTATGAAGT CTATCGTAAG ACTAATCACA 8040

AGTATGAACT CAAGCAGCGA CTCTATTTTT TGATGGAACA TTCTAGGGAC TTTGAGGATT 8100

TCAAAAAGAA TGCTCCGCTA CTACATGTGG AGATGGATTT CCGTCACAAG CATGCCACCT 8160

TTTTTATTAC GGACTCAACT ATGAAACAGG TGGTGCGTGG CAAGCAACTC AATCGCAAGC 8220

AGCCTTACAC AGAAGAATTT TTTAAGAACT ACTTTGCCAA AAGAGAAATA GAAAGTCTCA 8280

TGGAATTTTT ATTGCTGAAA GTTGAGAATA TGGATGATTT ACTTCAGAAA GCAAAACTTT 8340

TTGGACTAAC TATCAATCCT AAACAAAAGC ATGTTTCTTT TCAATTTGCA GGAGTGGAGG 8400

TAAAGGAGAC AGAGCTAGAC CAGAAAAATC TTTATGATGT AGAGTTTTTC CAAGATTATT 8460

TTAAAAATAG AAAAGATTGG CAAGCTCCAG AAACTGAGGA TTTCGTTCAA CTTTATCAAG 8520

AAGAAAAGTT ATCCAAAGAA AAAGAACTTC CAAGCGATGA GAAGTTCTGG GAGTCCTATC 8580

AAGAGTTCAA GAGTAACAGA GATGCCGTTC ATGAATTTGA GGTGGAGTTG TCACTCAATC 8640

AAATTGAAAA AGTAGTGGAT GATGGAATTT ACGTCAAGGT CAAGTTTGGT ATTCGTCAGG 8700

AGGGACTTAT CTTTGTGCCG AACATGCAGC TTGATATGGA AGAGGATAAG GTGAAGGTTT 8760

TCATCAGGGA AACCAGCTCC TACTATGTCT ACCACAAAGA CGCTGCCGAG AAAAATTGTT 8820

ATATGAAAGG TCGAACCTTA ATTAGACAGT TCAGCTATGA AAATCAAACC ATTCCATTAC 8880

GCAGAAAAGC GACAGTCGAT ATGATTAAAG AGAAGATTGC GGAAGTGGAT GCTTTGATTG 8940

AACTGGAAGT AGAAAATCAA TCTTATGTCA CGATTAAAGA TGAGTTAGTG CATGAACTAG 9000

CAGCGTCTGA ATTGAGAATC AATGAGTTGC AAGAACGAAT GTCAACCTTG AATCAAGTAG 9060

CAGAATATCT ACTGGCTTCA GTTGAAAGTA AGCAAGAAAT GAAATTAAAT CTTTCAAAAG 9120

TGAATATAAC TGAGAATATC AGTGCTAATA TTGTTGAGAA AAAATTGAAG AGCCTGGGGA 9180



ATCAACTGGA ATTGGAAAGG GGCAGGTATG AAAAGATGGT AGT 9223

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6827 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

TCTGCTGGCT ACCATCATCT GACTTGGGCA AGACCAAAGT CTTAGTTACA ACTGTATTCT 60

TCTCAGCATT TTCAATAACT GGCAATGCCG ACTGAAGCGT ATCTTTTTCT GTTTTTGTAG 120

CTGGTCCAGT TTCTTTTTTC TGTCCGCAAC CAACCAGGAC AAAAAGGAAA GCTAGACTAA 180

CAAGAACTAT TTTTTTCATT TCTTTCTTCT TTCTTTTTGA AATTAAAATA GAATAAGACT 240

GGGAAGTGCT CCCAGCCTTG ATGTTTATAG AGCTGCACGC AAACGTGCTT CTGCATTTTC 300

TACATTACGG ACAGAGCGTG GTAGGAAGGC ACGAATATCG TCTTCCTTGT AGCCAACTTG 360

CAGGCGTTTT TCATCTACAA GGATTGGGCT CTTTAAAATT CTCGGTGTTT CCATAATCAG 420

ATTGAGAACT TCATTGACAC TCAAATCTTC AATATCCACT CCAAGGGCTT TGGCATAGCG 480

ATTTTTAGAC GAAACGATGC TGGCTATTCC GTTATCTGTT TTGGTTAGAA TATCCAGTAA 540

TTCTTCTCTC GTAATTCCTT CTTTACCAAG GTTTTGTTCT TTATAACTTA ACTGGTGGGC 600

ATTGAGCCAG GTTTTTGCTT TTTTACAGCT AGTACAACTT GAGACTGTAT AAATTTTAAT 660

CATGTACCTA CCCCTTTCGC TACATGTTAC TATCAGTTTA GTCTATTATA CCATAAAAAA 720

CATCCGACTT GCGACCTATT TTTAATTTTT TTTGACTTTT TTCGTCATTT TCGTACTTTT 780

TTCTTGACAA ACAACTAAAT GACTATCAAC TCTTTTGGAG CTAGGGTCAA TAATTCACAA 840

CCTGTCTCTG TAATCAGGAT ATCATCCTCG ATACGAACGC CATATTTGCC TTCGATATAG 900

ATACCTGGTT CATCGGTCAA GGCCATACCT GTCTTAATAG TTTCTGTAGA AGTCTGACTA 960

AAGTAGGGTT CCTCATGGAT ATCCAGACCA ATACCGTGGC CAATGCCGTG AGTAAAGTAG 1020

TCACCATAAC CTGCCTCAAT GATAATATCA CGAGGGATTT TGTCAAAGTC ACGGAAACCT 1080

AAGCCTGCCT TAGCTTGGTC AATCAAGGCT TGGTTAGCTT TTAGAACCGT ATTGTAAATC 1140

TCTGCCTGCT CATCGCTAAC ATGCCCTAGA TAGATAGTCC GGGTCATATC ACTGACATAG 1200

TGGTCATAGA GACAGCCGAA GTCCATGGTG ATGGCTTCTC CCAACTCCAC TGGTTTGTGC 1260

ATTGGATGGG CATGGGGTTT AGAAGAATTG ATACCGCTAG CTAGGATCGT ATCAAAAGAT 1320

AAGCCAGATG CTCCCAACTC ACGCATGCGG AAATCAAGGA AGTTGGCAAT CTCAATTTCA 1380

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GTTTTTCCTG GTTTGATAAA GTCAAGCGCA TCGCGGAAAG CTTGGTCTGA GATAGAACAA 1440 

GCCTTGCGAA TCGCTGCAAT CTCTGCCTCA TCCTTAATCA TACGAAGACC TTCCACAAAC 1500 

TGAGTTTGTG GAAGCAAGTT CAAACCTGCA AAAGCTGCCT GCATACGGTG GTAATAAGAC 1560 

ACTGAAATCT CATCTTCAAA ACCGATACGA GTCAAGCCCA TGTCCTTAAC AATTCCTGCA 1620 

ATGACAGCCA ATTCATCACG ATCAGCCACA ATCTCAAAAC CACTGGTTTC TTGCTTAGCT 1680 

GCGATGATAT AGCGAGAGTC TGTCACTAAG ACCTGACGGT CACGACTGAT AAAGACTGTT 1740 

CCGTTTGAGC CCCAAAAACC AGTCAAATAA TAGACGTTTT TAAGATTGTT GATGATGATA 1800 

CCATCTAGTT CTTTTTCTTG CATTTTAGCT AGAAATGCTT GTACGGGTTT ATTCATGATG 1860 

TAACTTTCCT TTCAAATAGT GTCCTGTATA GCTGGCTTCG TTGGCAGCTA CTTCTTCTGG 1920 

AGTTCCTGTT ACGATGATGG TTCCACCACC GACACCGCCC TCAGGTCCCA AGTCAATGAT 1980 

ATGGTCTGCC GTCTTGATAA CATCCAGATT GTGCTCGATG ACGAGGACTG TATTGCGATC 2040 

GTCTAGAAAG CGAGCTAAAA CCTTGAGCAG GCGAGCAATG TCCTCTGTAT GAAGCCCTGT 2100 

CGTCGGCTCA TCCAGAATGT AGAAAGATTT TCCTGTCGAT CGTTTGTGGA GTTCGCTAGC 2160 

TAACTTCATA CGTTGGGCTT CTCCCCCAGA AAGGGTGGTA GCTGGCTGTC CCAAGGTCAC 2220 

ATAGCCTAGC CCTACATCCT TGATGGTCTG GAGTTTGCGT TGAATTTTCG GAATGTGTTG 2280 

GAAAAATTCT ACCGCATCGT TGACCGTCAT ATCCAAGACC TGCGAAATAT TCTTTTCCTT 2340 

GTAGTGAACT TCTAGGGTTT CACTGTTATA GCGGGTTCCG TGGCAAACTT CACAAGCCAC 2400 

ATAAACATCT GGCAAGAAGT GCATCTCAAT CTTGATAATC CCGTCACCTG AGCAAGCTTC 2460 

ACAGCGACCT CCCTTGACGT TGAAACTGAA GCGCCCCTTC TTGTAGCCTC GAATCTTGGC 2520 

TTCATTTGTC TGAGCAAAAA GGTCACGTAT ATCGTCAAAA ACTCCTGTAT AGGTAGCTGG 2580 

GTTAGACCTC GGCGTCCGTC CGATAGGGCT CTGGTCAATA TCAATCAAAC GGTCGACATG 2640 

CTCAATCCCT GTAATAGTCT TAAACTTACC AGGTTTGTCT GAATTACGGT TGAGCTTCTG 2700 

GGCAATGGCT TTTTTGAGAA TGCTGTTGAT TAGAGTCGAT TTCCCTGAAC CCGACACACG 2760 

TGTCACTGCG ATAAATTTTC CTAGTGGAAA GCGAGCCGTG ACATTTTGCA AGTTGTTCTC 2820 

ACGCGCTCCT ATCACTTCAA TAAAACGACC ATTTCCGACA CGGCGCTCTT CTGGTACTGG 2880 

GATGACACGT TTGCCTGACA AGTACTGACC TGTGATAGAC TTGCTGTTGC GAGCCACTTG 2940 

CTTAGGTGTA CCTGCTGCAA CAATCTCACC ACCAAAAACA CCGGCACCAG GACCAACGTC 3000

AATCAGATAA TCAGCCTCAC GCATGGTATC TTCGTCGTGT TCCAGCACGA TAAGAGTATT 3060 

GCCCAAGTCA CGCATCTTTT TCAGACTGGC AATCAGGCGA TCATTGTCGC TCTGGTGAAG 3120 
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nLLVjftl luAL 


L>LiL 1 UUTvJTA GGATATAGAG 


GACACCTGAT 


AGGTTGGAAC 


CAATCTGGGT 


3180 


1 uU_AAAL(jA 


ATGCGCTGAC TTTCCCCACC 


TGAAAGGGTT 


CCTGCTGAAC 


GTGACAGGGT 


3240 


1 AbAl A\3 1 1 A 


AGACCCACAT TATTAAGGAA 


GGTCAAACGA 


TCCTTGATTT 


CCTTGAGAAT 


3300 


Vj*j<o A\_Vi Aks LA 


ATGATGGCTT CATTTTCAGA 


CAAAGTTAAC 


TGGCTCACCA 


AGTCCAAGTG 


3360 


oTCAGCGATA 


GACAGGTCTG AGATTTCTCC 


AATATGTGGC 


CCTTGCTGGC 


CGCCCACACG 


3420 


GACAGACAAG GCCTGGTCAT TGAGACGATA GCCTTGACAG GTTCCGCAGG TCAGCTCATT 3480

CATGTAGAGA CGCATCTGAG TGCGAGTGTA ATCGCTATTG GTTTCATGGT AACGACGTTT 3540

GATATTATTG ATAACTCCCT CAAACGGAAT GTCGATATCG CGCACGCCAC CAAATTCATT 3600

CTCATAGTGG AAATGGAATT CCTTACCATC TGACCCATAG AGAATCAAGT TCTTATCTTC 3660

TTCTGACAGG TCCTCAAAAG GCTTATCCAT AGCCACTCCA AAGACTTTCA TGGCCTGCTC 3720

TAACATGTTT GGATAGTAGT TGGATGAGAT AGGATTCCAA GGTGCTAGCG CTCCCTCACG 3780

TAAGGTTTTG CTAGCATCTG GCACTACCAA ATCAGTATCC ACCTCCAGCT TGATGCCCAA 3840

GCCGTCACAC TCACTAGAAG AGCCAAAAGG AGCATTGAAA GAAAAGAGAC GAGGCTCTAA 3900

CTCTGGGACA GTAAAACCAC AAACTGGACA GGCATAATGC TCAGAGAACA ACAACTCCGA 3960

GTCGTCCATG GTGTCGATAA TGACATAACC TTCTGCAATA CGAAGGGCAG CCTCAATGGA 4020

ATCAAAGAGA CGACTACGAA TGCCCTCCTT GATAACAATA CGGTCAACCA CGACATCGAT 4080

ATTGTGTTGC TTGCTCTTAG ACAACTCTGG CACTTCGGTC ACATCATAGA CTTCCCCATC 4140

CACACGGACA CGAACATACC CGTCTTTCTG AACCTTCTCG ATAACACTCT TATGTTGGCC 4200

TTTTTTCTTG CGGATGACAG GAGCCAAGAT CTGCAAGCGC TGGCGTTCAG GTAACTCCAA 4260

AACCTTATCA ACGATTTGCT CCACAGAAGA AGCATTGATA GCTCCATGTC CGTTGATACA 4320

GTAAGGCGTC CCCACACGTG CGTAGAGGAG ACGCAGATAG TCATTGATTT CAGTCGTCGT 4380

TCCCACCGTC GAGCGAGGAT TTTTACTAGT CGTTTTCTGG TCGATGGAAA TAGCTGGGCT 4440

GAGACCATCA ATGGCATCTA CATCTGGTTT TTCCATATTT CCCAAGAACT GACGAGCGTA 4500

GGCGGACAAA CTCTCTACAT AGCGACGTTG TCCCTCCGCA TAGAGAGTAT CAAAAGCCAG 4560

ACTGGACTTC CCTGAACCTG ACAAGCCAGT CACGACAACC AACTTGTCTC GCGGAATCTC 4620

CACATCAATA TTTTTTAAAT TATGGGCACG CGCCCCATGA ATGACAATTT TATCTTGCAT 4680

CTTTGTTCTT TCTAGTCCAT TATTGCTTAC CATTATACCA AAAAAAGTGA GATTCTATTA 4740

CCCAAAAGGC CGATTTTGTA GTATAATAGT ACAGTGTGAA AAAATCTGAA AAATGAGAAA 4800

GGATAAGGGA TATGAAACAA GTTTTTCTCT CTACAACAAC TGAATTTAAA GAGATCGATA 4860

CGCTTGAACC GGGTACTTGG ATCAATCTCG TCAATCCGAC TCAAAATGAA TCACTCGAAA 4920






TCGCCAACAC CTTCGATATT GATATTGCTG ACCTTCGAGC ACCGCTCGAT GCGGAAGAAA 4980

TGTCTCGTAT TACCATTGAA GACGAGTATA CCCTGATTAT CGTAGACGTG CCGGTCACGG 5040

AGGAAAGAAA TAACCGCACC TACTACGTAA CCATCCCGCT TGGTATTATC ATCACTGAGG 5100

AAACCATTAT CACTACGTGT TTGGAACCAC TACCTGTCCT TGATGTCTTT ATCAACCGTC 5160

GATTGCGTAA TTTCTATACC TTCATGCGTT CACGTTTTAT CTTTCAAATT CTTTATCGCA 5220

ATGCAGAGCT TTACCTAACA GCCCTTCGTT CAATCGACCG CAAGAGTGAA CAAATCGAAA 5280

GTCAACTGCA TCAATCAACT CGTAATGAAG AATTGATTGA GCTCATGGAA TTGGAAAAAA 5340

CTATCGTCTA TTTCAAGGCC TCCCTCAAAA CAAATGAGCG CGTGATTAAG AAATTGACCA 5400

GTTCAACCAG CAATATCAAG AAATACCTTG AGGACGAAGA CCTGCTTGAA GACACCCTGA 5460

TTGAAACCCA ACAGGCCATC GAGATGGCAG ATATTTATGG AAACGTCTTG CATTCTATGA 5520

CAGAGACCTT TGCCTCTATC ATTTCTAACA ACCAGAACAA CATCATGAAA ACCTTGGCCC 5580

TTGTGACCAT CGTCATGTCC ATCCCAACCA TGGTCTTTTC TGCCTACGGG ATGAACTTTA 5640

AGGATAATGA AATCCCCCTA AACGGAGAGC CAAATGCCTT CTGGTTAATC GTCTTTATCG 5700

CCTTTGCTAT GAGTGTCTCG CTCACTCTCT ATCTCATCCA TAAAAAATGG TTCTAAGAGG 5760

AGTTCCTATG TCTCAAATTG ATCTACAAAA ATTAACTAAG AAAAACCAAG AGTTTGTCCA 5820

CATTGCTACC CAACAATTCA TCAAAGATGG GAAAACAGAC GCTGAAATCC AGACTATTTT 5880

TGAGGAAGTC ATTCCCCAAA TCCTTGAGGA GCAATCTAAA GGTACAACTG CCCGTTCCCT 5940

ATACGGCGCA CCAACTCATT GGGCTCATAG CTTCACTGTC AAAGAGCAGT ACGAAAAAGA 6000

GCATCCAAAA GAAAATGATG ACGCAAAACT GATGATTATG GACTCAGCTC TTTTCATCAC 6060

TAGCCTCTTT GCCCTTGTCA GCGCCCTCAC AACCTTCTTT GCGGCAGACC AAGCTTTCGG 6120

CTATGGATTG ATTACTCTTC TATTAGTTGG ACTGGTTGGT GGATTTGCCT TCTACTTGAT 6180

GTACTACTTT GTTTACCAAT ACTATGGACC AGATATGGAT CGCAGTCAAC GTCCACCTTT 6240

CTGGAAATCT GTACTAGTTA TCCTAGCTTC TATGTTCCTT TGGTTGCTTG TCTTCTTTGC 6300

AACAAGCTTC CTACCAGCTA GCCTTAACCC AGTACTGGAT CCATTGCCAC TAGCTATTAT 6360

TGGAGCAGCC CTCCTAGCCC TTCGCTTCTA TCTCAAGAAA CGCTTGAATA TCCGTAGTGC 6420

AAGTGCAGGA CCAACACGCT ATCAAGAATA AGAAAACGAT AAAAGCAACT GCAGGTGCGG 6480

TTGCTTTTTC ACTTACTTTT TTGAGTTATA TTCAATGAAA ATCAAAGAGC AAACTAGGAA 6540

GCTAGCTGCA GGTTGCTCAA AGCACAGCTT TGAGGTTGCA GATAAAACTG ACGTGGTTTG 6600

AAGAGATTTT CGAAGAGTAT TAAAAGTATT CTTCTGAAAT CCCACATAGC TTTCTCTTAT 6660

ATTTTGTGAT AAAATAGGCT CAATCTATTT CTAGGAGGAT GAGATATGGT TTCTACTATT 6720

GGTATTGTTA GTTTATCTAG TGGCATTATC GGAGAGGATT TTGTCAAACA CGAAGTGGAC 6780

TTGGGTATCC AACGTCTCAA GGATCTGGGA CTCAATCCCA TCTTTTT 6827

(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11864 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

CTGGCTAGTT GCATAGAGCA AAGTTGCTTC TTCATCAACA AAACCGTTCA TTTCAAAATA 60

GGAAAGCAGC TCATCAGGAC TCTCCAAACG AATCCCTTTG TAATCCAGCT CAACTGCCAC 120

CTCTTTCAAG GCTGCAAGAA GAAGTGTTCC CAGGCCCTGT CTCTGATGGT CAAACTCGAT 180

GACTAAAGAA TGTACTTTTA GACATTGCGG ATTGTCTGAC TGGGGACTTG ATAAAATATA 240

GCCTAAAAGT TGATTTTCAT CCCTAGCTAG AAGAAAGGTA TCCGCACACT TACGGATACT 300

TTCTTCTAAA ATATGGGAAA GTTGCTGCTT TTCAGCTGGA AAAGACGAGG TCTGAAGTGC 360

CCCTATCTCA GGCAAATCAG ACTTGCTTGC CTGAATGATC TTAATTGGAA TTTCCATGGG 420

AACATCCTAT TGAACATTGC TTGTCAAGTT AGACAAGAGA CGCTCAAATG AGTATTCATA 480

GGTTTGGATG TCTCCTGCTC CCATAAAGAC GTAAACAGCA TTGTCATGGT CTAGGAGTGG 540

AGAAACATTT TCAACAGTAA TCACTTGGTG TTTTTTGTTG ATTTTGTTGG CTAGGTCTTC 600

TACCTTAACG TCACCATGAT CTACTTCACG AGCCGAGCCA TAAATTTGCG CTAGATAAAC 660

AGCATCTGCT TGGTTTAAAG CATGGGCAAA GTCGTCCAAC AAGGCAATGG TTCTTGTAAA 720

GGTATGCGGT TGAAAGACTG CTACAATTTC CTTGCTTGGG TATTTCTGAC GAGCCGCATC 780

CAAGGTCGCA ATAATTTCTG TTGGATGGTG GGCAAAGTCA TCGATAATCA CTGTATCATT 840

GACAATTTTC TCAGTGAAAC GACGTTTAAC ACCGGCAAAT GTTTTCAAGT GCTCACGCAC 900

CAAGTTCAAA TCAAATCCTG CTGTGTAAAG AAGACCAATA ACGGCTGTCG CATTCATGAT 960

ATTGTGACGA CCAAAGGTTG GAATGTGGAA TTGCCCCAAG TTTTGTCCAC GGAAATGAAC 1020

GGTGAAGGTT GAACCAGTTA TTGAACGAAG AAGATCACTA GCTACAAAGT CATTGCCTTC 1080

AGCTTCAAAA CCATAATAAT AAATTGGTGC ATCAGACGTA ATCTTACGCA ATTCAGCATC 1140

TTCACCATAG ACAAAAAGAC CCTTGGTGAT TTGTTTGGCA TAGTCGTTAA AGGCATTAAA 1200

AACATCCTCG AGACTTGTGA AATAATCTGG ATGGTCAAAG TCAATGTTGG TGATAATAGA 1260

GTATTCTGGG TGGTAAGGCA TGAAGTGACG CTCATATTCG TCAGATTCAA AGACAAAATA 1320

TTTGGCATTG GCCGAACCAC GACCTGTCCC ATCTCCAATC AAGAAGCTGG TATCTGTAAT 1380

GTGAGACAAG ACATGAGACA ACATACCTGT CGTTGAAGTT TTTCCATGTG CTCCTGCTAC 1440

TCCCATGCTA ACAAAGTCAC GCATAAAGCT ACCTAGAAAC TCATGGTAAC GTTTGTAGCT 1500

GATACCATTT TGGTCCGCAT AGGCAATTTC GACGTTGTTA TCTGGACGAA AGGCATTTCC 1560

AGCGATAATT TCCATATCAC CGTCTAGATT TTTTTCATCA AAAGGAAGAA TGGTAATTCC 1620

TGCCTGCTCA AGACCGCGTT GGGTAAAGTA GTACTTTTCA ACATCTGATC CCTGAACCTT 1680

GTGCCCCATC TGGTGCAACA TCAAGGCCAA GGCACTCATC CCTGATCCCT TAATTCCGAT 1740

AAAATGATAT GTCTTTGACA TGTTTTCTCC CCTATTCTGT CATTCTGGTC AGATTCAACT 1800

CTTGGGCAAC CCGACGTTCT TGTTCTGTTT GTTTACTTTT TTTATTGTAG ATTTGGCTCT 1860

TCTTTAGAAA ATCATAATTG TTTTTCTTTG GAGCAGGTGC TGACACTTCT TCATTCTTGG 1920

TAGGGATAGA ATGAACTTCT TCCGCCAAGA TATAATGAGA CTGGGTCAAT TTTTGGCTAT 1980

ATTTGACAAA TTCACCAGGA TTTTCCTTTT GGAAAGGAGC TGTCGGTTGA TTGCCCTGTC 2040

TAACTAGACT GGGCTGAGAA TGACGTCTCG CAAGGCTGAA ATCCTGAGTT AGGTAGTTAG 2100

CAGAGCGTTT CTTTTTCAAG TCCGCACGCG CTTCTTCACG CGCCACCTCC GCATAGCTCT 2160

TTCCTTCTTT TTTAACCCCT AAAGGAGCCT TTTTAGGTTT TTCGACTTGC TTTTCAATCG 2220

GTTTTACTGG TTTTTCTTCA GCAATAGGAG CCCATTCTAA ATAATTTTTA TCTCGATACT 2280

CACCCTTGAT ATTACTGATC AGATCAGACT CATCATAGAG ATTCATGAGT GGCATTTCAG 2340

TCAACATGAC CTCGTCATCT GACACCAATG GAAATCGTTC TTGTTTCATT TTCTATTTCC 2400

TTTCAACACT TCATTATAGC GTATTGTCTT GATTTTTCAA GTGCTGGCTT CAGAAATTCC 2460

CAAAATTTCT CTAATTTCTG CTAGGGTCAG ACTACCACGT GACTCTGTGC CGTCCAATAC 2520

TTGTGACACC AGATGTTTCT TTTGTTCTTG GAGTTCCTGA ATTTTTTCTT CAATGGTTCC 2580

CTTGGTCACC AAGCGATAGA CCTCAACCGT TTCTTCCTGA CCCATCCGAT GGGCACGGCC 2640

AATGGCTTGC GCTTCCACCG CAGGATTCCA CCAAAGGTCA ACCAAGATCA CTGTATCTGC 2700

ACCTGTCAGG TTCAGACCGA CCCCACCAGC CTTGAGGGAA ATCAGAAAGG CATCTCTTTC 2760

TCCTTGGTTA AAGGCCTTGG TCATGTCTTG TCTTTCCTTG GCTGGGGTTG AACCCGTAAT 2820

TTTAAAGGAA GTCAGGCCCA AGTCTGGCAG TTCTTGTTCA ATTTTTTCCA ACATTCCCTT 2880

GAACTGAGAG AAAATCAAGA CACGGTGTCC GCCGTCTGCC ACCTGTACCA GTAGGTCTCG 2940

GAGACTATCT AGTTTGCCGC TGGCTCCCTG ATAATCTTCC ATAAACAGGG CAGGAGTGTC 3000

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ACATATTTGA CGCAAGCGCA TCAAACCAGA TAAAATTTCC ACACGACTTC GCTGAAATTC 3060 

CTGTTCTGAC ACTTGAGCCA GATGGTCTCG CATCTGTTGT AACTGGGCAA GGTAAATAGC 3120 

CTTTTGCTGG TCTTCCAGTT CATTTTTATA AACCACCTCA ATCAAGTCTG GCAATTCAGT 3180 

CAGAACTTCT TCTTTCTTGC GTCGCATCAC GAAAGGCTTG ATAAACTGAG CCACTCGCTC 3240 

TGCTGGCAAT TTCATAAATT CTTTCTTGCT TGGCAAAAGT CCAGGCATGA CGATTTGGAA 3300 

AATAGACCAC AACTCACCCA GATGGTTTTC AATCGGAGTT CCTGACAAGG CAAAGACCGA 3360 

CGGCACCACA AATTGTCTCA AGGTCTGGGC AATCTTGGTC TGGGCATTTT TCATGACCTG 3420 

AGCCTCATCT AAGAAAAGGA AGTCAAAGGC CATCCCTTGA TAAAACTCAC TGTCCTGACG 3480 

GAAGGTGGCA TAGCTAGTCA CATAGATTTG ATGGCTCTCG GCAAGAATCT CCTCACGACT 3540 

TGCTTTCAAA CCATGAACAA CAGTCACATC CAACTGTGGA GCAAATTTCT GAAACTCATC 3600 

TGCCCAGTTG TAAATCAAAC CCGACGGAGC GAGAATCAAA ACCCGAGTTT CTTTTGTCAC 3660 

TTGACTAGTC AAAAAAGCAA TGGTCTGAAG GGTTTTCCCA AGTCCCATAT CATCAGCCAA 3720 

AATCCCACCA AAACCATAAT GATGGAGCAT CTGCAACCAG CCAATTCCCT TTTCCTGATA 3780 

ATCTCGCAAG TCAGCCTTGA CCTGAGTTGC TTGCAAAGGA AAGTCCTCTG GATGCGTCAA 3840 

ATCCTGGGCC AGATTCTGGA ATTCTTGTGA AAAAGAAACA CGGTCTCGCC CTTCAAAGAG 3900 

ATGAGCTAAA CTGTAGGCCA AGGATTTCCG AGCCTGCAAG GTCCCATCTT TTAATTCAAA 3960 

TTGCCCCAGT TCCTGTAGAT TTTGGCGAAT TTTCTTGGTT TCTTCATCGA AAAAGTAAAC 4020 

TTGATTAGAC GAATCAATAT AAAAATCCTG ATTGGCAACC AAGGCCTGCA TGGCTTGGTC 4080 

GATTTCCTCC TGGACAATAT TTTGAAAATC AAACTGGATT TCCAAGAGAC CTCCCTTGGA 4140 

GGCAATCTGC ACCTGAGGAC TCGCTAGGCT ATAAAGCTCT TCTAGTTTAT CTGATAGGTC 4200 

AACATGCCCG AGTTTTTCAA AGACTGGAAT GATATCATGA AAAAAATGAT AGACAGACTC 4260 

CGCTTTTAAG GCCTGACGCC AAGATTGAAA ATCGGCCTCA AAGCCCGCAG CCAAACAGAC 4320 

TTGGAAAATT CTTTCTTCTA AGTCTGCGTC ACTTGAAAAG GGTAATTCTT CTAGCTCTTG 4380 

TCGGCTAGAT ACCTGTCTAT TTCCATAATC AAACTGAATT TCTAAACGAA TCCGATTATC 4440 

TTCTTCCCTG TCAAAGTAAA AAGAGGGCGC AAAAGTTTTG ATTTGTAGAC GTTCTGGAGC 4500 

TGAAACGGTG CCCATCTGGA TAAAAAGAGT CAGACAGGAG GCCAATTTGT CTCGATCACT 4560 

GCTATCAAAT TGCAGGTATT TCTTTCCTTG TTGACCCACA GGTAACGCTT TAATTTCCTT 4620 

GAGAAGACGC ATCTGCTGGT CTGTTAAAAA ATAAACCTGA CCTTTATGGA AAAGTACTGC 4680 

TCCCTGATAA AAGACATTGA CCCTAGGACT CTCACTGATT TCCATTTCAA AATAATCCGA 4740 

GTATTCTGTT ACTGTAAAGG CAAATAGATT GGCATCAGCA TGCATATCCT GAAAAAGCAG 4800 
GGTTTGGTAG CTATCCACTT GATGGTCAAA TTGAAAATGG GGCAAGGCCA TCAGTAAATT 4860

CACACCCTGC TCAAAAAAGG TCAGAGGGAA AAAGAGGTGC CGACCTTGGT TTTGGAAAAA 4920

GAGGTCTGGA ACCAGCCCTT CCTCCGTTAG TCCGTGCAAG AAAGTCAAAA GTTCTTGGCT 4980

GGCATCATCA AAGGCTTCCC AAGAAAGAGA CTCCTCATAA ATCTTGCCAA TCATATACGA 5040

CTTTCTCTGC TCGACAATCC TTAAAAAAAG TGGAATATCG CGAATGACAT AGTATTTTTG 5100

GCTATTGATT TGGCCGATTC TCAGAGTCCA CAAGATATGA TTGGTTCCTG CTTCCACCTG 5160

ACCCACAGCT GATAACTCAT AGGCGCATTG TGATTTTGGA GATAAAATTC GATCCAAAAA 5220

CTTGCCACCC AAGGTCACCT TGGTTTCAAC AGCCTCTTTT TCTTCATGAC CTTCTTCCAG 5280

ACTCCACAAG ATTTCCTGAC CAC6CTCATC ATTTTTCAGA AAATGCTCTA GCGCTGCCAA 5340

ATGCACACAG TAGCCCCTCT TTTGAAAAAA ATCACAGGCA CAAAAAACCA AATCATCCTC 5400

TAAACTATAG CGCAGTTCTT CTTCTGCAAC GCGAGCGTAG AGCCGATTGT TCTTTTCCTT 5460

GATGATATCA ACCTTACCAG TTTCATAAAG GGCAACACCT TCGATACGAA TTTTGCCCGG 5520

AATCAATTTA GCCATATTTT CACCTTTACC TTATCTTTTT ATTATACCAT ATTTTCGCCT 5580

ATGAAAATAG CCTTCTAGGA AGACTTTTCT CCTAGAAGGC TGGATTTTTA ACGTTTGGCA 5640

AAAGTAGCCA CAATCCGCTG ACAGACTTCT TGCAACAGAG ATTTGGGCAT AGCTATATTG 5700

ATGCGGGCAT GGAGACTTCC TTCCTCTCCA AAATCCAAAC CACGGTTGAG GATAACCTTG 5760

GCTTCATTTC TCAACAACTC TTGCAATGTT TCATCAGTCA GGTGATAAGC TGAAAAGTCA 5820

AGCCAAATCA AGTAGGTACC TTGCGGTTTC ATGACCTTGA TTTTAGTCTC TTTTCCAAAT 5880

AGATCCATCA CATAATTGAT GTGGTCTTCA AAGACTTGCT TGAGTTCCTC TAGCCAATCT 5940

TTACCGTATC GATAGGCAGC TTCTGTCGCC AAATAACCCA AGCCTGAAAT TTCATGCTGA 6000

TTATTGGCCA ACAGGCGTTT CTGGAAAGCC AGTCTCAACT TAGGATTTTC AATGACTGCA 6060

TAGGAATTTT TTGTTCCAGC AATATTAAAT GTTTTAGTGG CACTGCTGAA GACGATAGCA 6120

AAATTTTTGA AGGCAGGATT GATGGTATTG AAAGAGTGGT GTTTGTGACG AAAGAGGGTC 6180

AAATCTTGGT GAATCTCATC CGAAACTAAC AAAACACCGT GTTTTTGGCA GAGTTGGCCA 6240

ATCTTCTCCA ACACTTCTTT TTCCCAAACA CGTCCACCAG GATTGTGAGG GTTGCAAAGA 6300

ACATAGAGTT TAACCTCCTC TTCCACCAAA TCCTTTTCAA GTTGGTCAAA GTCAATCTCA 6360

AACAGACTAT CCTTTTCCAC TAAGGAATTA GTAATCAATC TACGATTATT CAACTTGACA 6420

CTGCGAGCAA AGGGTGGGTA GACAGGCGTG TTAATTAAAA CCGCGTCGCC TTCTTTTGTA 6480

AAGGTTTGAA TAGCTGTTGA GATGGCTGGT ACCACACCCT CGATAAAGAC AAGAGCCTCT 6540




TTGTCAAAGT TGTAACCGTA TTGTGTAGCT TCCCACTTTT GAACTTCCTT AATTAAGTCT 6600 

TCACTGGCAT AGGTATAACC ATAAACCAGT TGGTCTGCGT AAGTTTGCAC GGCTTGGCGG 6660 

ATTTCAGGCA AGACCACAAA GTCCATATCC GCTATCCAAG CTGGTAGAAC TTCACTATCC 6720 

GTTTCTGTTT CTTTCCATTT ATAGGTATGG TGCCCTAAAC GGTTGGGCAG GCTTGTAAAA 6780 

TCATATTTTC CCATCTTTGT CTTATCCTTC TATGGCTTGG CGCAAATCTG CAATCAAATC 6840 

TCTAGCATCC TCAATCCCAA TAGACAAACG CAAGAGGTCA TCTGTCAAAC CATAAGAATG 6900 

GCGTACCTCT GCTGGAATAT CAGCATGAGT TTGAGTCGTT GGATAAGTAA TAAGACTTTC 6960 

CACTCCACCC AAACTTTCCG CAAAAGAGAA GACCTTGAGA CTGTTCAAAA TATGAGGAAT 7020 

GCGTGTTTCA TCGGCTACTT TAAAGGAAAT CATGCCTCCA CGACCAGTGT AGAGAACTTC 7080 

CTTAACTGCT GGAGAATCCT TCAAAAAGGC AACCACTTCT TGGGCGTTAG CTGTTGAGCG 7140 

CTCCATACGA AGAGACAAGG TCTTGAGACC ACGAAGCAAC TGGTAGCTGT CAAATGGAGA 7200 

CAAGACTGCC CCTGTTGTAT TAAGATTGTA AAAAAGCTTC TCGTATAGTT CTAAACTATT 7260 

GGTCACAACC ACTCCAGCCA AGACATCATT GTGGCCTGCT AGATACTTGG TTGCTGAATG 7320 

GAGAACGATA TCTGCTCCAT CTTGAATCGG ACGTTGGTAG ATAGGGCTAT AGAAGGTATT 7380 

GTCCACCACC ACTTTGGCAC CCTTAGCATG AGCCAATTTT GCTAGTTTTT CGATATCAAA 7440 

TTCCAACATC AAGGGATTGG TTGGGGTTTC GATATAGAGA ACATCCACAT CCTTTTCTAA 7500 

CTCGGCAATC AACTCTTCTT CTGTATTGGC ATAGGTAAAA TGGAAATGAC CTTCCTGCTC 7560 

CACTTGGTTA AACCAGCGAA AAGAACCACC GTAAAGATCA CGCACTGCCA AGACCTTACT 7620 

TCCTACTGGA AAGACGCTAA AGGCCAGTAC AATAGCTGAC ATCCCTGAGC TAGTCGCTAG 7680 

GGCATAGTCT GCTGACTCAA TAGCCGCCAA GACTTCCTCA GCCTTACTAC GAGTTGGATT 7740 

TTTAGTGCGC GTATAGTCAA ACCCAGTAGA TCGACCAAAC TCTGGATGCT GATAGGTCGT 7800 

TGAAAAATGA AGTGGTGTCA CCAAAGCACC TGTTGCCTCA TCAGACTTGA TCCCTGCTTG 7860 

TGCTAAAATT GTGTTAATGT GTAATTCCTT GCTCATACAA TTCCTCCAAA TCTATAGTAA 7920 

CTATTGTACC ACTTATTTTG TATCCTTCGT TTTCTTGTTT TCAAGAGCTA GTTATAGTTT 7980 

CAAACTATAT AAAAAGGGAG TTTTTCCTGC TCCCTTTAAT AGACTATAAA ATGGTGAATC 8040 

TCAAAAGACA CCTTCACTCT ATCATTTGCT CCTGCACAAA ACGAGCATAA CGCTCATGAT 8100 

TTTCCAGTAG TTCCTTATGA GTTCCTGAGC CAGTGATTTT CCCCTCCTCT AAGAAGAAAA 8160 

TACAATCCAC ATCTTTTACC GTTGACAAAC GATGCGCTAT AATCACAACC GTCTTCTCCT 8220 

TTAGTACAGA ATAGAGGCTA CTGATAATCG CATACTCAGA ATCCGCATCA AGATTAGCAG 8280 

TGGCTTCATC AAATATAAGA ATTTCAGCAT CTTTTAAGTA GGCTCTAGCT ATTTGAAGTC 8340 
TTTCGTTCGC CCCCCTGACA AGAGTCGTCC GCGTTCACCA ACTTCAGTAT CTAGTCCCTC 8400

TTTCATGGAG CGAATCTCAT CACCTAGTGA TACTAAGTCT AGCACTTTCA TCAATTCATC 8460 

ATCAGTTACT AAGCGATTCA AACCGAGACA AAGATTGTCA CGAATACTGC CAGATAAGAC 8520 

TGCATTATTT TGTGAAACCC AAGCGATTTT ACTTCTCCAT TCTTTTAAGT TAAAATCATA 8580 

TATACTTGAT TGCTCCATTA GAATATCTCC TGAAAGCGGT TTATAAAACC GCTCTAACAA 8640 

ACGCACAATC GTTGATTTTC CTGATCCAGA TGGTCCAACA AAAGCAATTT TTTGCCCCTT 8700

GAAAATTGAA CAAGTAATAT CCTTTAAGAC AGGTCGATTT TCATCATAAC CAAAATAGAC 8760 

ATGGTTAAAA TTCAACCCTC GTCCTGATAC CGATTTTCCT CCCTCAAATT TTTCTTTAGG 8820 

AACTGCAAGC AAGTTCTCCA GTGCAACTGA AGATCCCTTG CTCCTAGAAT AAACAGTTAC 8880 

AAAATTAGCT ATATTACTAA TAGGATTAAG TAATTGAAAG AGGTAAATCA AAAACGAAAC 8940 

CAAGGTTCCC ACAGATATAT ATCCTGCGCT GACCCGATAA CCCCCATAGG TTAGCATCAC 9000 

AGCTATAGTC GCAAAGATAA ATAAGAGAGC AAACGGGGTC TCAAAAGAAG TAACCCTATC 9060 

TGATTTCAGT GAATTGTTTT GTACCCTTTC AATACAATTA TCCAAAACAT CCTGTACACT 9120 

TTTCTCTGCT TGGTTAGTCT TAATTAATTC ATGTTCTTGA ATCTTTTCAG TCAATTGCCC 9180 

TGTTAAATTT CCTCCTGTAA ACGACGACTA TACTTTTCAC TGATATTGGA AAGGGGCAAG 9240 

ATAATAAACA TCATACAAGG AAGAGTGATG AATAAAAGTA GAGAAAGATT CCAATCAAGA 9300 

CTAAATAAGA CTACAATGGA ACCAAGTACC ATAACTAAAC TCAGAATAAT ATTTGGGAAA 9360 

GTCGTAATTA AAAACTCACG AATGACACTC GTGTCATTGA CAATGGCAGA AGTCAACTCC 9420 

CCACTTTGGC TCTTATCAAA GAAGGATTTC TCTACATAAA TCAACCCCTC TATCACTTTT 9480 

TTCCTGATTT TTGCTATCTT TTTTTCACCC GATTGACTAA ACAGATAGTA ACCAATAGAA 9540 

GAAAACAAGG CTTGACCAAT AAAAATCAAA AACGATTGAA ATACTTTGGA GCCTATATTT 9600 

TCAATAGAAC TCCCATCTAT TAAATCCTTT AAGATAAGGG GAAGCAACAA AGCAAGTAGA 9660 

CTAGACAGAA CAAGTAAGAA ACTCCCCATA ATCACCTTAG TATCTACTCT TAATAATTTT 9720 

AATTTCATAA ATACTCCTTA TAATATTTCA ACGGATAAAG TCGGGAATAA CTCAATTTGA 9780 

GGATAAAATC TAATAAATCT TCCTATAACA AAACGCATAA CATCTAGGAT TTTATATACC 9840 

TGATATTATG CGTTTTTAAG CACAAAGACT TCTTACACAA ACTTATCTAC AATTAGATTT 9900 

TATTTGACAT GTTTTGCCAA TTCTTCTTGG GCTTTTTTAT TGGATTCTTC TTTTTCTTTC 9960 

AACCATTTTT CTCTGGCTTT TGCATATTCG TCTGTTGTGA CAATCTTATC TTGTACTTTG 10020 

AGGTATTTAT ATGATTCAAC CCCTTTTGTA CCGGTTAAAC CATAGGCAGC AGCAAATGGT 10080 
ACGGTTCTTC TCAATGATGG TGTTCPrrTA PfVr.AftAPJir mvvaaraar


1 AAAUAAC 1 A 


10140 


TCAATCAACC AAGCTTGAAT ATT'Af^r'ATA'P ttpwuti ar /""prnvrir'fw 


ATCTTGCTCT 


10200 


TTATTAGCTT CTTCCAAPAT TTfilftTlTlfi a r* A TV-r- a rrr- rAsrwr^mn 


AGCCTTGTCA 


10260 




AGTATTAAAA 


10320 


n j. n x v.unv>n i *vjVj 1 1 V*rt.L.\aO Lj 1 \_ 1 1 VsA i AA iA_ AGGTCC CC AACCGCCA TG 


ATATAAATCA 


10380 


* x * * a \_ i lo l i luAbCAAAu 1 A(jt_t. iuAAL TGTGAAACTC 


ATCTGATGTT 


10440 


AAT"W3fTG A A TYSTVA ATV^i/" 1 TAr*n'nm]tnir>K r* H H /"tv*im » » » n nm-tumirn-tn % m 

™-» * j. vjv. 1 itj ru/wt I AL.A1 1 A 1 (JA (aAACCT AAAA CAGATTCAAT 


TGATTGTTTG 


10500 


ATAGAACTAA CTCCTTGTAT GCCTACTTTA TCTGTTACTT CCACAGTCTT ATCCAAGTGG 10560

ATTGGGAATT GAACACCCTT TGCTTCGAGT TCTTTCTTAG CTTCCGCAAA CTTAGCCTTG 10620

(jv- 1 1 a (_ i^Atj laAi l\a L AkTA AGGGTCTTGA CCATCCGCAA AGTTGATACC TTGCCATTCC 10680

TTACCATAGT TGACCATCTT AGAGGCTACA ACTTCACCAA AGTCTTTTCC CTTGATACTG 10740

ACAAAGTTTG GAGGAACCAC TAGGTTACGC AAAATCTTTG TTGCACCTTC TTTCCCTTCA 10800

GACTGAGCCC CATAAGATGT TCTGTCAAAA GCAAAATTGA TAGCCTGACG GAAGTTTTTA 10860

TTGAGAACTG CTTCCTGAGT CGATTTCTTT TCAATGTCAC TTGTTTTAGA AGTATAATTG 10920

TAAGACTTCC TATCTAGGTT AAAATTAAAG AAATATGAAG TTGAATTTTG CATACTATAG 10980

ATGATATTGT TTTTGTATTT TTCTTTAATC CCTTCATAGC TGGAGCTGTT AGGAAAAAGA 11040

CGAGCCGTAG TATAAGCACC AGCTGTAAAA TTACGTTCCA GTGATTCTTG GTCGCTACCA 11100

TCATAGTAGG TCAATTTCAC ATCGTCTACA AAGACATTCT TAGCATCCCA GTAATTAGGG 11160

TTTTTCTTAT ATTCAATAGC AGATTTTGAG ACAAGTGCTT TCATCAAGAA AGGTCCATTG 11220

TACAAAATAC TAGATGGATC CGCCTTCCCA AAATCATCCC CTTTTGATTT CAGGAAATCT 11280

GCATTAACAG GAAAAAGTAT CGTTGCAAGT GTTTTTGAAT TCCAGTAAAG TTCTGGTTTA 11340

ACCAAAGTAT ATTGAACCGT TTGGTCATCA AGTGCCTTGA CACCGACAGT TGAAAAGTCG 11400

CTTGTTTTAC CAGTGATATA GTCATCCAAA CCAGCAACAG AGTCCTGCAC TAGATACAAG 11460

GCTTCTGATT TTTTATCAGC TGCATATTGC AAACCTGTCA CAAAATCCTG GGCAGTTACA 11520

GGCGCATATT CTTCTCCCTC AGAAGTAAAC CACTTGGCAT CCTTACGAAG TTTGTAGGTA 11580

TAGGTCAAAC CGTCCTGAGA AACAGTCCAA TCCTCTGCTA ATGATGGAAT AATATTCCCA 11640

TATTGGTCAT TTTCTAATAA CCCGTCTACC AAATTTGCAA CAATATCGGA TGTTGCTGCG 11700

CGGTTTTCTG CTAGATAGTT CAAGCTAGAT GGATCACTTG AATAAACATA GTTGTAGGTT 11760

TTTGACGCCG TGCTAGAATT TCCACACGCG CTCAATAAAA CTCCTGTACC CAGGACAAGA 11820

CCTGCCAAGG TTAGATATTT GCTCTTAGAC TTTTTCATTT CCGG 11864
(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2412 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 



TAACTGCACT AAACATAATA TAAGGAGAGA AAATGTCTGC AATAGAACGT ATTACAAAAG 60

CTGCTCACTT AATTGATATG AACGATATTA TCCGTGAAGG GAATCCTACT CTACGCGCGA 120

TTGCTGAGG A r AGTC ACTTTC CCCCTATCTG ACCAGGAAAT CATCCTAGGC GAAAAGATGA 180

TGCAATTCCT TAAACATTCC CAAGATCCTG TCATGGCTGA AAAAATGGGA CTCCGCGGTG 240

GTGTTGGACT GGCTGCTCCC CAGTTAGATA TCTCAAAACG CATTATCGCT GTTTTGGTAC 300

CTAATATTGT TGAAGAAGGC GAAACTCCAC AGGAAGCCTA CGATTTGGAA GCCATTATGT 360

ACAATCCAAA AATCGTCTCT CACTCTGTTC AAGATGCTGC TCTTGGCGAA GGAGAAGGTT 420

GCCTGTCTGT TGACCGTAAC GTGCCTGGCT ATGTTGTTCG CCATGCCCGC GTTACTGTTG 480

ACTACTTTGA CAAAGATGGA GAAAAACACC GTATCAAACT CAAAGGCTAC AACTCCATTG 540

TTGTTCAGCA TGAAATTGAC CACATTAACG GTATCATGTT TTACGATCGC ATCAATGAAA 600

AAGACCCATT TGCAGTTAAA GATGGTTTAC TGATTCTTGA ATAAAGAAAA TCCCGTTGCA 660

AGACGGGGTT TTGTGTTATA ATAGAGGCAT GAAAACAAAT GATATTGTCT ATGGTGTCCA 720

CGCCGTTACC GAAGCCCTCC TTGCAAATAC AGGAAACAAA CTCTACCTCC AAGAAGATCT 780

CCGAGGTAAG AATGTTGAGA AAGTCAAGGA ACTAGCTACA GAAAAGAAGG TGTCCATTTC 840

TTGGACATCA AAAAAATCTC TCTCTGAGAT TACTGAAGGT GCTGTTCATC AAGGTTTTGT 900

TCTACGAGTG TCTGAATTTG CCTATAGCGA GCTAGATTAC ATCCTTGCAA AAACACGCCA 960

AGAAGAAAAT CCACTTCTAT TGATTCTAGA TGGTCTAACC GATCCCCATA ATCTGGGTTC 1020

TATCTTGCGA ACAGCCGATG CGACCAATGT TTCAGGTGTC ATCATTCCCA AGCACCGTAC 1080

TGTCGGAGTA ACTCCTGTCG TTGCCAAAAC AGCCACAGGT GCTATTGAAC ACGTtCCAAT 1140

TGCCCGAGTG ACCAACCTCA GTCAAACCTT AGGATAAACT TAAGGATGAA GGTTTCTGGA 1200

CCTTTGGAAC GGATATGAAC GGTACTCCTT GCCACAAGTG GAATACAAAA GGGAAAATCG 1260

CCCTCATCAT TGGAAATGAA GGAAAAGGTA TCTCTAGCAA CATCAAAAAA CAGGTCGATG 1320

AAATGATTAC CATTCCGATG AATGGACATG TTCAAAGCCT TAATGCCAGT GTTGCTGCGG 1380
CCATTCTCAT GTACGAAGTT TTCCGAAATA GACTATAAAA AAGTTTCCAG TCATCTGATT 1440

GGAAACTTTT TTATGATTAA CTATGTTCTG TAATGAATTT ATAGGCTTCT TGACCAGCGA 1500 

TAGCTCCATC TCCAACCGCT GTTGTTACTT GGCGAAGGTC TTTCAAGCGA ACATCTCCAA 1560 

CTGCAAAGAT ACCGTCGACT GCAGTTTTCA TGTGGTTATC TGTCACAATC CATCCTGCCT 1620 

GATCTTGGAT ATTCAATTCT TTAACAAAAT CGCTAAGAGG GTCCAAACCA ACATAGATAA 1680 

AGACACCACC GAAGGCTTGT TCTGTCACTT GACCTGTTTT CACATTTTCA AATACGACTG 1740 

ATTCTACTCG GTTTTCACCC TTGATTTCCC TTACTACAGA ATCCCAGATA AAGCTGATTT 1800 

TTTCATTCGC AAAGGCGCGA TCTTGTAAAA CCTTTTGGGC ACGAAGTTGG TCACGACGGT 1860

GAACAATGGT AACAGTCTTA GCAAAACGAG TCAAGAAGAG GGCTTCTTCA ACAGCTGAAT 1920 

CTCCACCACC AACTACCAAT AAATCTTGGT CACGGAAGAA AGCACCATCA CACACAGCAC 1980 

AGTAAGAAAC ACCACGACTG TTCAGTTCTT CTTCTCCAGG CACTCCCAAA GGACGGTGTT 2040 

TAGAACCAGT TGCTACGATA ACTGTACGTG TTTCATATGT TTGGTCATCA GTCATCACTT 2100 

TCTTAAAATC ACCATGGCTT CGACATTTTC AACATAACCA TAAATGTGCT CAACACCAAG 2160 

ATTTTCAAGT GGTTCAAACA TCTTTTCAGC CAATTCAGGT CCACTAATAT TAGCGTATCC 2220 

TGGGTAATTT TCGATATCAG ATGTATTATT CATCTGACCA CCTGGCAGAC CACCTTCAAT 2280 

CAAAGCTACT TTTAGATTGC TTCGAGCAGC ATACAAGGCC GCAGTCATCC cTGCAGGTCC 2340 

AGCACCGATA ATAATAGTAT CGTACATATA GATTCCTTCT TTCTTGGTGT AACTATCTTT 2400 

ATTCTAACTC TG 2412 
(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7760 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

CCGATTTGGT GGAATTTTTG TCTCATCATT TAGAAGGTGT TGCAAGAGCA GAGTTTACCT 60

TGGTGCTTCA TACCAAATTG GGAGAAGCCT CTGTTTTGGC AAATATTGTA GATGTAAACA 120 

AGGATGAATG GATTTTAGGA ACAGTTGCTG GTGCCAATAC CTTATTGGTT ATTTGTCGAG 180 

ATCAGCACGT TGCCAAACTC ATGGAAGATC GTTTGCTAGA TTTGATGAAA GATAAGTAAG 240 

GTCTTGGGAG TTGCTCTCAA GACTTATTTT TGAAAAGGAG AGACAGAAAA TGGCGATAGA 300 

AAAGTTATCA CCCGGCATGC AACAGTATGT GGATATTAAA AAGCAATATC CAGATGCTTT 360 
TTTGCTCTTT CGGATGGGTG ATTTTTATGA ATTATTTTAT GAGGATGCGG TCAATGCTGC 420

GCAGATTCTG GAAATTTCCT TAACGAGTCG CAACAAGAAT GCCGACAATC CGATCCCTAT 480 

GGCGGGTGTT CCCTATCATT CTGCCCAACA GTATATCGAT GTCTTGATTG AGCAGGGTTA 540 

TAAGGTGGCT ATCGCAGAGC AGATGGAAGA TCCTAAACAA GCAGTTGGGG TTGTTAAACG 600 

AGAGGTTGTT CAGGTCATTA CGCCAGGGAC AGTGGTCGAT AGCAGTAAGC CGGACAGTCA 660 

GAATAATTTT TTGGTTTCCA TAGACCGCGA AGGCAATCAA TTTGGCCTAG CTTATATGGA 720 

TTTGGTGACG GGTGACTTTT ATGTGACAGG TCTTTTGGAT TTCACGCTGG TTTGTGGGGA 780 

AATCCGTAAC CTCAAGGCTC GAGAAGTGGT GTTGGGTTAT GACTTGTCTG AGGAAGAAGA 840 

ACAAATCCTC AGCCGCCAGA TGAATCTGGT ACTCTCTTAT GAAAAAGAAA GCTTTGAAGA 900

CCTTCATTTA TTGGATTTGC GATTGGCAAC GGTGGAGCAA ACGGCATCTA GTAAGCTGCT 960 

CCAGTATGTT CATCGGACTC AGATGAGGGA ATTGAACCAC CTCAAACCTG TTATCCGCTA 1020 

CGAAATTAAG GATTTCTTGC AGATGGATTA TGCGACCAAG GCTAGTCTGG ATTTGGTTGA 1080 

GAATGCTCGC TCAGGTAAGA AACAAGGCAG TCTTTTCTGG CTTTTGGATG AAACCAAAAC 1140 

GGCTATGGGG ATGCGTCTCT TGCGTTCTTG GATTCATCGC CCCTTGATTG ATAAGGAACG 1200 

AATCGTCCAA CGTCAAGAAG TAGTGCAGGT CTTTCTCGAC CATTTCTTTG AGCGTAGTGA 1260 

CTTGACAGAC AGTCTCAAGG GTGTTTATGA CATTGAGCGC TTGGGTAGTC GTGTTTCTTT 1320 

TGGCAAAACC AATCCAAAGG ATCTCTTGCA GTTGGCGACT ACCTTGTCTA GTGTGCCACG 1380 

GATTCGTGCG ATTTTAGAAG GGATGGAGCA ACCTACTCTA GCCTATCTCA TCGCACAACT 1440 

GGATGCAATC CCTGAGTTGG AGAGTTTGAT TAGCGCAGCG ATTGCTCCTG AAGCTCCTCA 1500 

TGTGATTACA GATGGGGGAA TTATCCGGAC TGGATTTGAT GAGACTTTAG ACAAGTATCG 1560 

TTGCGTTCTC AGAGAAGGGA CTAGCTGGAT TGCTGAGATT GAGGCTAAGG AGCGAGAAAA 1620 

CTCTGGTATC AGCACGCTCA AGATTGACTA CAATAAAAAG GATGGCTACT ATTTTCATGT 1680 

GACCAATTCG CAACTAGGAA ATGTGCCAGC TCACTTTTTC CGCAAGGCGA CGCTGAAAAA 1740 

CTCAGAACGC TTTGGAACCG AAGAATTAGC CCGTATCGAG GGAGATATGC TTGAGGCGCG 1800

TGAGAAGTCA GCCAACCTCG AATACGAAAT ATTTATGCGC ATTCGTGAAG AGGTCGGCAA 1860 

GTACATCCAG CGTTTACAAG CTCTAGCCCA AGGAATTGCG ACGGTTGATG TCTTACAGAG 1920 

TCTGGCGGTT GTGGCTGAAA CCCAGCATTT GATTCGACCT GAGTTTGGTG ACGATTCACA 1980 

AATTGATATC CGGAAAGGGC GCCATGCTGT CGTTGAAAAG GTTATGGGGG GTCAGACCTA 2040 

TATTCCAAAT ACGATTCAGA TGGCAGAAGA TACCAGTATT CAACTGGTTA CAGGGCCAAA 2100 
CATGAGTGGG AAGTCTACCT ATATGCGTCA GTTAGCCATG ACGGCGGTTA TGGCCCAGCT 2160

GGGTTCCTAT GTTCCTGCTG AAAGCGCCCA TTTACCGATT TTTGATGCGA TTTTTACCCG 2220 

TATCGGAGCA GCAGATGACT TGGTTTCGGG TCAGTCAACC TTTATGGTGG AGATGATGGA 2280 

GGCCAATAAT GCCATTTCGC ATGCGACCAA GAACTCTCTC ATTCTCTTTG ATGAATTGGG 2340 

ACGTGGAACT GCAACTTATG ACGGGATGGC TCTTGCTCAG TCCATCATCG AATATATCCA 2400 

TGAGCACATC GGAGCTAAGA CCCTCTTTGC GACCCACTAC CATGAGTTGA CTAGTCTGGA 2460 

GTCTAGTTTA CAACACTTGG TCAATGTCCA CGTGGCAACT TTGGAGCAGG ATGGGCAGGT 2520 

CACCTTCCTT CACAAGATTG AACCGGGACC AGCTGATAAA TCtACGGTAT CCATGTTGCC 2580 

AAGATTGCTG GCTTGCCAGC AGACCTTTTA GCAAGGGCGG ATAAGATTTT GACTCAGCTA 2640

GAGAATCAAG GAACAGAGAG TCCTCCTCCC ATGAGACAAA CTAGTGCTGT CACTGAACAG 2700 

ATTTCACTCT TTGATAGGGC AGAAGAGCAT CCTATCCTAG CAGAATTAGC TAAACTGGAT 2760 

GTGTATAATA TGACACCTAT GCAGGTTATG AATGTCTTAG TAGAGTTAAA ACAGAAACTA 2820 

TAAAACCAAG ACTCACTAGT TAATCTAGCT GTATCAAGGA GACTTCTTTG ACAATTCTCC 2880 

ACTTTTTTGC TAGAATAACA TCACACAAAC AGAATGAAAA GGAGCTGACG CATTGTCGCT 2940 

CCCTTTTGTC TATTTTTTAA GGAGAAAGTA TGCTGATTCA GAAAATAAAA ACCTACAAGT 3000 

GGCAGGCCCT GGCTTCGCTC CTGATGACAG GCTTGATGGT TGCTAGTTCA CTTCTGCAAC 3060 

CGCGTTATCT GCAGGAAGTC TTAGGCGCCC TCCTTACTGG GAAATATGAA GCTATTTATA 3120 

GTATCGGGGC TTGGTTGATT GGTGTGGCCG TAGTCGGTCT AGTTGCTGGT GGACTCAATG 3180 

TTGTCCTCGC AGCCTATATT GCCCAAGGAG TTTCATCCGA CCTTCGGGAG GATGCCTTCC 3240 

GTAAAATTCA AACCTTTTCT TATGCTGATA TTGAACAATT TAATGCGGGA AATCTAGTCG 3300 

TTCGAATGAC AAATGATATC AACCAGATTC AGAACGTTGT CATGATGACC TTCCAAATTC 3360 

TTTTCAGACT TCCCCTCTTG TTCATCGGTT CGTTTATCCT AGCGGTTCAA ACCTTACCTT 3420 

CTCTGTGGTG GGTGATTGTT CTCATGGTAG TCTTGATTTT TGGTTTGACT GCTGTCATGA 3480 

TGGGAATGAT GGGGCCTCGT TTTGCCAAGT TTCAAACCCT TCTTGAGCGC ATCAATGCCA 3540 

TTGCCAAGGA AAATTTACGT GGCGTTCGTG TGGTCAAGTC CTTTGTCCAA GAAAAAGAGC 3600 

AATTTGCTAA GTTTACAGAG GTCTCAGACG AGCTTCTTGG TCAAAACCTT TACATTGGTT 3660 

ATGCCTTTTC AGTAGTGGAA CCCTTTATGA TGTTGGTTGG TTACGGGGCG GTCTTCCTCT 3720 

CTATTTGGCT GGTCGCGGGA ATGGTTCAGT CGGATCCGTC TGTTGTTGGT TCCATCGCTT 3780 

CTTTTGTTAA TTACCTAAGC CAGATTATCT TTACCATTGT TATGGTTGGA TTTTTGGGAA 3840 

ATTCTGTCAG CCGTGCCATG ATTTCCATGC GTCGTATTCG AGAAATTCTT GACGCAGAGC 3900 
CAGCTATGAC CTTCAAGGAT ATCCCAGATG AAGAGTTGGT TGGAAGTCTT AGCTTTGAAA 3960

ATGTGACCTT TACCTATCCA ATGGACAAGG AACCGATGCT GAAAGATGTG AGCTTTACTA 4020 

TTGAACCTGG TCAAATGGTT GGTGTAGTTG GAGCGACTGG TGCAGGAAAG TCAACCTTGG 4080 

CTCAATTGAT TCCACGTCTC TTTGATCCAC AGGACGGGGC CATTAAAATC GGTGGCAAGG 4140 

ATATTCGAGA AGTGAGTGAA GGAACCCTGC GTAAAACAGT TTCCATCGTT CTCCAACGTG 4200 

CCATTCTTTT TAGTGGAACG ATTGCAGATA ACTTGAGACA GGGGAAGGGG AATGCTACTC 4260 

TATTTGAAAT GGAGCGCGCA GCCAATATTG CCCAGGCTAG TGAATTCATT CATCGTATGG 4320 

AGAAAACCTT TGAAAGTCCA GTTGAAGAAC GGGGAACCAA TTTCTCTGGT GGACAAAAAC 4380

AAAGGATGTC GATTGCGCGT GGGATTGTCA GCAATCCACG TATTCTGATT TTTGATGATT 4440 

CGACCTCAGC CTTGGATGCC AAATCAGAGC GCTTGGTGCA AGAAGCTTTG AATAAGGACT 4500 

TGAAGGGGAC GACAACCATT ATTATTGCTC AAAAAATTAG CTCGGTTGTC CATGCAGACA 4560

AGATCTTGGT TCTAAATCAA GGACGATTGA TTGGTCAAGG TACGCATGCA GACTTGGTTG 4620 

CCAACAATGC CGTTTACCGT GAAATCTATG AAACACAGAA ATGAAAGACA AACTATAAGA 4680 

AAAGTCAATA GTTTTATCTA AACTATTTCT TATTTCAATT TGATGATTTG GCGATGATTT 4740 

TAGAGCACGG CAAAAAGCCC TTGAAAAAGT CCATTTTTTC AAAGGTAATC CTGTGTTAAT 4800 

TTCAGAAATT ACATCACTTT TTGTTCGTCA AATGGCAGCT CTTTTTTTAG GATATAAAAC 4860 

AGGGTTCGGA TAAGTTTTTT TGCAAGGTGG ATGATGGCTA CATTGTAATG TTTTCCTTGT 4920 

TCTAATTTAG TCTTAAGATA GGCCTTAAAA GCAGGCGAAA AGCGAGGGCA TGCTTTGGCA 4980 

GCTTGTATGA GTACCTACCG CAGATGAGGG GAACTCCGTT TGACCATTCT TCCTGCTAAA 5040 

TCAATCTGAT CTGACTGATA AATAGAAGAA TCCAGTCCAG CGAAAGCTTG TAATTGAGGA 5100 

GGATTATCAA AGGCATGAAT ATTTCGAATC TCAGCTAAAA TGACCGCCCC TAAACGATCC 5160 

CCAATCCCAG TAACCGTCGT GATGACCGAG TTGAACTCAG CCATCAAGTC ATTGACACAT 5220 

GTTTCCGCCT TGTCAATGAG CCTCTTGTAA TGTTTGATGT TTTCATTACA CGAGATAAAA 5280 

CGTCTATGCG TTATCAAACT CATTACCAAT TAAAACAAAA AGCTGTGGTT AGATCCTTTC 5340 

GGAAATTGTC AAGCGATTGG AGGAAATGAA CTAATCCACA GCGGCTTATT CGAAGTATAC 5400 

CACTTGGGCT TTGGCAGTAG CTAACTGCGC TAAATATAAT ATAAGGAGGA GTAAAATGAA 5460 

GACAGTTCAA TTTTTTTGGC ATTATTTTAA GGTCTACAAG TTCTCATTTG TAGTTGTCAT 5520 

CCTGATGATT GTTCTGGCGA CTTTTGCCCA AGCCCTCTTT CCAGTCTTTT CTGGACAAGG 5580 

GGTGACGCAG CTAGCCAATT TAGTTCAAGC TTATCAAAAT GGCAATCCAG AACTTGTATG 5640 
TAGTGTAATA TACATGTGTC TCATGACGCG CGTGATTGCA GAATCGACCA ACGAGATGCG 5760

CAAAGGCCTC TTTGGTAAGC TTGCTCAGTT GACGGTTTCT TTCTTTGACC GTCGACAAGA 5820 

TGGCGATATC CTGTCTCATT TTACCAGTGA TTTGGATAAT ATCCTCCAAG CCTTTAACGA 5880 

AAGCTTGATT CAGGTCATGA GCAATATTGT TTTATACATT GGTCTGATTC TTGTCATGTT 5940 

TTCGAGAAAT GTGACGCTGG CTCTCATCAC CATTGCCAGC ACCCCATTGG CTTTCCTTAT 6000 

GCTGATTTTC ATCGTGAAAA TGGCACGCAA ATACACCAAC CTCCAGCAGA AAGAGGTAGG 6060 

GAAGCTCAAC GCCTATATGG ATGAGAGCAT CTCAGGCCAA AAAGCCGTGA TTGTGCAAGG 6120 

AATTCAAGAG GATATGATGG CAGGATTTCT TGAACAAAAT GAGCGCGTGC GCAAGGCAAC 6180 

CTTTAAAGGA AGAATGTTCT CAGGAATTCT TTTCCCTGTC ATGAATGGGA TGAGCCTGAT 6240 

TAATACAGCC ATCGTCATCT TTGCTGGTTC GGCTGTACTT TTGAATGATA AGTCTATTGA 6300 

AACAAGTACA GCCCTAGGTT TGATTGTTAT GTTTGCACAA TTTTCACAGC AGTACTACCA 6360 

GCCTATTATC CAAGTTGCAG CGAGTTGGGG AAGCCTTCAG TTGGCCTTTA CTGGAGCTGA 6420 

ACGAATTCAG GAAATGTTTG ATGCAGAGGA GGAAATCCGA CCTGAAAAGG CTCCAACCTT 6480 

CACTAAGTTG CAAGAAAGTG TTGAAATCAG TCATATCGTT TTTTCATACT TGCCTGATAA 6540 

ACCTATTTTG AAAGATGTCA GCATTTCTGC CCCTAAAGGC CAGATGACAG CAGTTGTTGG 6600 

GCCGACAGGT TCAGGAAAAA CGACTATTAT GAACCTCATC AATCGCTTTT ATGATGTTGA 6660 

TGCTGGTGGT ATTTATTTTG ATGGTAAAGA CATTCGTGGC TATGACTTAG ATAGTCTTAG 6720 

AAGCAAGGTG GGAATTGTAT TGCAAGATTC GGTCTTGTTT AGCGGAACGA TTAGAGACAA 6780 

TATCCGATTT GGTGTGCCAG ATGCTAGTCA GGAAATGGTT GAGGTAGCAG CAAAAGCAAC 6840 

CCACATTCAC GACTATATCG AAAGTTTGCC TGATAAGTAC GATACTCTTA TTGATGATGA 6900 

CCAGAGCATC TTTTCAACAG GGCAGAAGCA ATTGATTTCA ATCGCTCGAA CCCTGATGAC 6960 

AGATCCAGAA GTTCTCATTC TCGATGAAGC AACTTCAAAC GTAGATACGG TGACAGAAAG 7020 

CAAGATTCAG CATGCCATGG AGGTGGTTGT AGCAGGTAGA ACTAGTTTCG TCATTGCCCA 7080 

CCGCTTGAAA ACCATTCTCA ATGCAGATCA GATTATTGTC CTTAAAGATG GAGAAGTCAT 7140 

TGAACGTGGT AACCACCATG AACTTTTGAA GCTAGGTGGC TTTTATTCAG AACTCTATCA 7200 

CAATCAATTT GTTTTCGAAT AAGAAAGAAG TTGTCCTATG TGGGCAGCTT TTTCTTGTCC 7260 

ATAAAAAATG TTTATCACAG CCTTAAAAAA AACATATTAG ACGAAAGTCA TTTTGAGTGA 7320 

TATGATAGGA CTATCGTTAG CATTCGAAAG GAGAGGCATC ATGGCTAGAA CGGTTGTAGG 7380

AGTTGCTGCA AATCTATGTC CCGTAGACGC AGAAGGCAAA ATCATTCATT CATCTGTATC 7440 
TTGTAGATTC GCAGAGATCA TTCGTCAAGT CGGTGGTCTC CCTTTAGTCA TTCCTGTTGG 7500

TGATGAGTCA GTTGTACGTG ATTATGTGGA AATGATTGAC AAACTCATTT TGACAGGAGG 7560 

CCAAAATGTT CATCCTCAGT TTTATGGAGA GAAAAAGACC GTCGAGAGCG ATGATTACAA 7620

TCTGGTCCGT GACGAATTTG AATTGGCACT CTTGAAGGAA GCGCTTCGTC AGAATAAACG 7680 

AATTATGGCA ATCTGTCGCG GTGTCCAACT TGTCAATGTT GCCTTTGGTG GAACCCTCAA 7740 

TCAAGAAATC GAAGGTCAGG 7760 
(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2723 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:



GAGGTTTTAA TTCACTTACC TCTsCCGTAT CTTTATTTAA AATGAATTCT TTTACGGTTG 60

TATTTCTTGC AAAATCTTTT ACAACAATCT TAATGTTTAG TGTCTTGTCT ATTATTTGTT 120

TAATATCATT AAATGATGTA TATTCTTTTC CATTTATATA AATATGTTGT TCTTGAATCT 180

CACCATCGAA TCCATTATTT CTTTTATCAT TGATGTTAAA GACTACAGAT TTTCCATCAG 240

CATATTCGAT ACTAGTATTT CCCTTAGGAT CAATGTTTAC TTCGGGTTTA ACATTATCAT 300

ATAAAAACTG ATAGTGGACT CCAACTGCTT TAGCATTCAA ATCGCTATAG CCAGTTTGAA 360

GATAAACATT TCCATCCATA TCTGTTACCT TATCTGGAAA TCCGTTTGCT TTATAGTCTT 420

TCATTCCCCA GTCCATGATG TCACCGTCTT TAACATTCAG CTTAATATTA AAATCTCTAG 480

TGTTATCAAT GTGTAAATCT CCGTAGATTA AATAATTATC TACAACCGAT TCATTAACTC 540

TCAATTCCCA GTTAAAACCA CCCTTATCAG AAATCTTACC TCTTAAATAA AATTCTGGAT 600

TTCGTACATA AATTTTATTA GATTTAGATG GATTAAAGTA GTTCTTATCC ATTGAAAGGT 660

TTACTGGTTT GGTATCAATA AATAACATGG AGCCATCTTC TTTTATAGCT TCTACATTGA 720

ACTTATCCTC TCCAGTGTAT TCTTTATCAT CCTTACCAAA TAATACAAGT TTAGAAGAAT 780

CTGTCACAAG ATTTCCGTCT TTATCGATAG CTTCCCCTTT ATCGTTCATT TTAAATGTAA 840

ACACTTGATA CCTTATAATG TTAAAGCCGT CCAAAGCCGA CATTAATACA GATTGGGTAC 900

TTCTTCCATC TTCAACATTT CTACTATCAG CATAAATTGT TGTTTCTGAA AGGGCTCTTA 960

GATTAGGATT GGCCTTTTGT ATTTTTGCTA TATCTTCCTT GCTATAGACT CCATTTCCTT 1020
CTAACATATC CGTTTTTCCA GGATTATAGG TAGTCACTTT TAGTGCATAG CCTTTTCTTA 1080

GAATGATATT ATCCTTTAAC AGATATTGTT GTTTTTCTGA ATCAGAATAG ATTTTACCAG 1140 

ATTCCATTTT AGTTAAATTG TCTGGTTTGT TTTTTGAAAG ATCTCCTTCC CCTAATTCTA 1200 

TGACATTCCC ATAACTTGAT ACATAGGGAT ATTCTGATTT AGTTTCCTTA ATTTTTTCAG 1260 

GCATTCTAAT TTTAATTTCA GCTTTTTTCT GATCATTATC TTTAACAAAT AATCTCATAT 1320 

CTCCTGCAAA AGCTAATCCA TCCACAATAT CATTAATATT AGCGTATAGA TCAAATGTCA 1380 

TCGTTTTTGA GTGGAAATCA TACTTGGTCG CTTTGATTTC TATAGATTTA TAGTTATTCC 1440 

CATAATATAC CTTGGCATTT TTAGAAACAT TACTTATCTT TCCAAGAATT TCAAAGTGTC 1500 

CATCTTTAGA CGGACTTAGA ACACCATAAA TTTTTGATTT GATTTCGTCA AGTTTCTCAG 1560 

TTTCATATTC TAGATCAGTC CCATCATCGT AGGCTATTAT ATTTCCTTTA TCATCGTATT 1620 

TATAATCGTA TTCCTCCATT CTCTTACCAG TTTCACTTGT AAAATCATCA ACTTCTCTAA 1680 

ATTTCTTTTT AATGAGTTTC TTTAAGTCTT TATTTTCAAA GTCTCTAATT GTTGAAATAT 1740 

TTCTATCAAT AGTAAAACTA GATTTTTCTT TAATAGACTC TTCATTTTCT TGATGATGAT 1800 

GTTCTACCCC AGTTGTATCT TTTTTTAGAC TACCCTCTTT TCCATTTCCT AAATTTTTAA 1860

ATTTAGATTC TGCAATCTCG CCAAGCTTTT GATATTTAGA TGAATCTTGA TCAGGATCTA 1920 

CTAGATAATA GGAAATCATC CCCTTTTCAT CAGCCTGATT AGCAAATTTA ATTCTATGAA 1980 

TCTTTGTGAA ATTGCTAGAA CCATCTAATG CAATGACTTC AATGATTTTT CCCCTTAAAT 2040 

CTCCCGCACC TTTAATTTCA TAAATGGTAT TTCCGTCTTT ATCAAGTTTT CTATTTCTTC 2100 

CTTGACCCTC ACCTGCGTAA GTTACTTCAA GATTTTTTTC AACCTCTCCA TCTTCATTAA 2160 

CAAGAGCGGC GCCAGCATAC CAAACTTCGT TCGCAATCTC GTCAAATTTT TCAGGATGTT 2220 

CTTTTTGATC TCTCGCAAAT AGCGTTTCAT TCTTATACTG ATCTTTTACC TTATGATAAG 2280 

TATCCTTTGT AATCAACTTA ATTTTTTCAG GATTTGAAAA ATCAACCGAA ACAATCTTAG 2340 

GGGCGGTGTT ATCAATTTTT ACAGGAATAT AGGAAACCTG CCATGGGTAA TCTTTAGTTA 2400 

ATCTATATTT AAATTTATAG AAATATTGAC CTTCCGCAAT CGGTTCAAAT TGACCTCTTA 2460 

TCTTAGTAGC AGGATCTTGA TTATCCTTAC TTTCTGGTGC ATTTTCTTCT CTACCTCTAG 2520 

GATTATAGAT GAGTCCATCC CACTTCAAGT CACCCCAAAC TTTTAGTTTA GATGATTTGA 2580 

TTCCCTTTGC ATCATTGCTT TTAGAATTTA AAATTCCTCT AATAAAGTGT TCTCTCGAAA 2640 

TGACTTTTAA GTCTCTTTGA TTTTCTCCCT CTTTATTTGT ATTTACTATT GAAATCAATC 2700 

CTTCTTCTGC ACTTCTTAAT ACA 2723 
(2) INFORMATION FOR SEQ ID NO: 65: 




(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11831 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

AAAAAAGTGG GAATGACTCA AATCTTCACT GAAGCTGGCG AATTGATCCC TGTAACAGTT 60 

ATTGAAGCAA CTCCAAACGT TGTTCTTCAA GTTAAAACTG TTGAAACAGA CGGATACAAC 120 

GCTATCCAAG TTGGTTTCGA TGACAAACGC GAAGTATTGA GCAACAAACC TGCTAAAGGA 180 

CATGTAGCGA AAGCTAACAC GGCTCCTAAG CGCTTCATTC GTGAATTCAA AAACGTTGAA 240 

GGCTTGGAAG TTGGTGCTGA AATTACAGTT GAAACATTCG CAGCTGGAGA CGTTGTTGAC 300 

GTAACGGGTA CTTCTAAAGG TAAAGGTTTC CAAGGTGTTA TCAAACGCCA CGGACAATCA 360 

CGTGGACCAA TGGCTCACGG TTCTCGTTAC CACCGTCGTC CAGGTTCTAT GGGGCCTGTT 420 

GCACCTAACC GCGTATTCAA AGGTAAAAAC CTTGCAGGAC GTATGGGTGG CGACCGCGTA 480 

ACAATTCAAA ACCTTGAAGT TGTACAAGTT GTTCCAGAAA AGAACGTTAT CCTTATCAAA 540 

GGTAACGTAC CAGGTGCTAA GAAATCTCTT ATCACTATCA AATCAGCAGT TAAAGCTGGT 600 

AAATAATAAA GAAAGGGGAA ATCAGTCACA ATGGCAAACG TAACATTATT TGACCAAACT 660 

GGTAAAGAAG CTGGCCAAGT TGTTCTTAGC GATGCAGTAT TTGGTATCGA ACCAAATGAA 720 

TCAGTTGTGT TTGATGTAAT CATCAGCCAA CGCGCAAGCC TTCGTCAAGG AACACACGCT 780 

GTTAAAAACC GCTCTGCAGT ATCAGGTGGT GGACGCAAAC CATGGCGTCA AAAAGGAACT 840 

GGACGTGCTC GTCAAGGTTC TATCCGCTCA CCACAATGGC GTGGTGGTGG TGTTGTCTTC 900 

GGACCAACTC CACGTTCATA CGGCTACAAA CTTCCACAAA AAGTTCGTCG CCTAGCTCTT 960 

AAATCAGTTT ACTCTGAAAA AGTTGCTGAA AACAAATTCG TAGCTGTAGA CGCTCTTTCA 1020 

TTTACAGCTC CAAAAACTGC TGAATTTGCA AAAGTTCTTG CAGCATTGAG CATCGATTCT 1080

AAAGTTCTTG TTATCCTTGA AGAAGGAAAT GAATTCGCAG CTCTTTCAGC TCGTAACCTT 1140 

CCAAACGTGA AAGTTGCAAC TGCTACAACT GCAAGTGTTC TTGACATCGC AAATAGCGAC 1200 

AAACTTCTTG TCACACAAGC AGCTATCTCT AAAATCGAGG AGGTTCTTGC ATAATGAATT 1260 

TGTATGATGT TATCAAAAAA CCTGTCATCA CTGAAAGCTC AATGGCTCAA CTTGAAGCAG 1320 

GAAAATATGT ATTTGAAGTT GACACTCGTG CACACAAACT TTTGATCAAG CAAGCTGTTG 1380 

AAGCTGCTTT CGAAGGTGTT AAAGTTGCCA ATGTTAACAC AATCAACGTA AAACCAAAAG 1440 
CTAAACGTGT TGGACGTTAC ACTGGTTTTA CTAACAAAAC TAAAAAAGCT ATCATCACAC 1500

TTACAGCTGA TTCTAAAGCA ATCGAGTTGT TTGCTGCTGA AGCTGAATAA TCTAAGGAGG 1560

AAATATCGTG GGAATTCGTG TTTATAAACC AACAACAAAC GGTCGCCGTA ATATGACTTG 1620

TTTGGATTTC GCTGAAATCA CAACAAGCAC TCCTGAAAAA TCATTGCTTG TTGCATTGAA 1680

GAGCAAGGCT GGTCGTAACA ACAACGGTCG TATCACAGTT CGTCACCAAG UTGGTGGACA 1740

CAAACGTTTC TACCGTTTGG TTGACTTCAA ACGTAATAAA GACAACGTTG AAGCAGTTGT 1800

TAAAACAATC GAGTACGATC CAAACCGTTC TGCAAACATC GCTCTTGTAC ACTACACTGA 1860

CGGTGTGAAA GCATACATCA TCGCTCCAAA AGGTCTTGAA GTAGGTCAAC GTATCGTTTC 1920

AGGTCCAGAA GCAGATATCA AAGTCGGAAA CGCTCTTCCA CTTGCTAACA TCCCAGTTGG 1980

TACTTTGATT CACAACATCG AGTTGAAACC AGGTCGTGGT GGTGAATTGG TACGTGCTGC 2040

TGGTGCATCT GCTCAAGTAT TGGGTTCTGA AGGTAAATAT GTTCTTGTTC GTCTTCAATC 2100

AGGTGAAGTT CGTATGATTC TTGGAACTTG CCGTGCTACA GTTGGTGTTG TCGGAAACGA 2160

ACAACATGGA CTTGTAAACC TTGGTAAAGC AGGACGTAGC CGTTGGAAAG GTATCCGCCC 2220

AACAGTTCGT GGTTCTGTAA TGAACCCTAA CGATCACCCA CACGGTGGTG GTGAAGGTAA 2280

AGCACCAGTT GGTCGTAAAG CACCATCTAC TCCATGGGGC AAACCTGCTC TTGGTCTTAA 2340

AACTCGTAAC AAGAAAGCGA AATCTGACAA ACTTATCGTT CGTCGTCGCA ACGAGAAATA 2400

ATATTAAACT AGTCGCTTAA GCAACTAGTA AATCCGCCAG CTCGGTAGCG CTCCATAGGA 2460

GTGCAAGCCG CTGTGGTACA ACATTTAAAG GAGAAAATAT AAAAATGGGA CGCAGTCTTA 2520

AAAAAGGACC TTTCGTCGAT GAGCATTTGA TGAAAAAAGT TGAAGCTCAA GCTAACGACG 2580

AAAAGAAAAA AGTTATTAAA ACTTGGTCAC GTCGTTCAAC GATCTTCCCA AGTTTCATTG 2640

GTTACACTAT TGCAGTTTAT GACGGACGTA AACACGTACC TGTTTACATC CAAGAAGACA 2700

TGGTAGGCCA CAAACTTGGT GAATTTGCAC CAACTCGTAC TTACAAAGGT CACGCTGCAG 2760

ACGACAAGAA AACACGTAGA AAATAAGGAG AACATAAATG GCAGAAATTA CTTCAGCTAA 2820

AGCAATGGCT CGTACAGTAC GTGTTTCACC TCGTAAATCA CGTCTTGTTC TTGATAACAT 2880

CCGTGGTAAA AGCGTAGCCG ATGCAATCGC AATCTTGACA TTCACTGCAA ACAAAGCTGC 2940

TGAAATCATC TTGAAAGTTT TGAACTCAGC TGTAGCTAAC GCTGAAAACA ACTTTGGTTT 3000

GGATAAAGCT AACTTGGTAG TATCTGAAGC ATTCGCAAAC GAAGGACCAA CTATGAAACG 3060

TTTCCGTCCA CGTGCGAAAG GTTCAGCTTC ACCAATCAAC AAACGTACAG CTCACATCAC 3120

TGTAGCTGTT GCAGAAAAAT AAGGAGGTAA AATCGTGGGT CAAAAAGTAC ATCCAATTGG 3180

TATGCGTGTC GGCATCATCC GTGATTGGGA TGCCAAATGG TATGCTGAAA AAGAATACGC 3240
GGATTACCTT CATGAAGATC TTGCAATCCG TAAATTCGTT CAAAAAGAAC TTGCTGACGC 3300

AGCAGTTTCA ACTATTGAAA TCGAACGCGC AGTAAACAAA GTTAACGTTT CACTTCACAC 3360

TGCTAAACCA GGTATGGTTA TCGGTAAAGG TGGTGCTAAC GTTGATGCaC TCCGTGCAAA 3420

ACTTAACAAA TTGACTGGAA AACAAGTACA CATCAACATC ATCGAAATCA AACAACCTGA 3480

TTTGGATGCT CACCTTGTAG GTGAAGGAAT TGCTCGTCAA TTGGAGGAAC GTGTTGCTTT 3540

CCGTCGTGCA CAAAAACAAG CAATCCAACG TGCAATGCGT GCTGGAGCTA AAGGAATCAA 3600

AACTCAAGTA TCAGGTCGTT TGAACGGTGC AGATATCGCC CGTGCTGAAG GATACTCTGA 3660

AGGAACTGTT CCGCTTCACA CACTTCGTGC AGATATCGAT TAGGGTTGGG AAGAAGCAGA 3720

TACTACATAC GGTAAACTTG GTGTTAAAGT ATGGATCTAC CGTGGTGAAG TTCTTCCAGC 3780

TCGTAAAAAC ACTAAAGGAG GTAAATAACC AATGTTAGTA CCTAAACGTG TTAAACACCG 3840

TCGTGAGTTC CGTGGAAAAA TGCGCGGTGA AGCAAAAGGT GGAAAAGAAG TAGCATTCGG 3900

TGAATACGGT CTTCAAGCTA CAACTAGCCA CTGGATCACT AACCGCGAAA TCGAAGCTGC 3960

TCGTATCGCC ATGACTCGTT ACATGAAACG TGGTGGTAAA GTTTGGATTA AAATCTTCCC 4020

ACACAAATCA TACACTGCTA AAGCTATCGG TGTGCGTATG GGATCTGGTA AAGGGGCACC 4080

TGAAGGTTGG GTAGCACCAG TTAAACGTGG TAAAGTGATG TTCGAAATCG CTGGTGTATC 4140

TGAAGAGATT GCACGTGAAG CGCTTCGACT TGCTAGCCAC AAATTGGCAG TTAAATGTAA 4200

ATTCGTAAAA CGTGAAGCAG AATAAGGAGA AGGCATGAAA CTTAATGAAG TAAAAGAATT 4260

TGTTAAAGAA CTTCGTGGTC TTTCTCAAGA AGAACTCGCG AAGCGCGAAA ACGAATTGAA 4320

AAAAGAATTG TTTGAACTTC GTTTCCAAGC TGCTACTGGT CAATTGGAAC AAACAGCTCG 4380

CTTGAAAGAA GTTAAAAAAC AAATCGCTCG CATCAAAACA GTTCAATGTG AAGGGAAATA 4440

ATAGACTAGG GAAGGAGAAA TTTCAATGGA ACGCAATAAT CGTAAAGTTC TTGTTGGACG 4500

TGTTGTATCT GACAAAATGG ACAAGACAAT CACAGTTGTA GTTGAAACAA AACGTAACGA 4560

CCCAGTCTAT GGTAAACGTA TTAACTACTC TAAAAAATAC AAAGGTCATG ATGAAAACAA 4620

TGTTGCCAAA GAAGGCGATA TCGTACGTAT CATGGAAACT CGCCCGCTTT CAGCTACAAA 4680

ACGTTTCCGT CTTGTAGAAG TTGTTGAAGA AGCGGTCATC ATCTAATCAA ACCTGAAAGG 4740

AGAAAACTGA AATGATTCAA ACAGAAACTC GTTTGAAAGT CGCAGACAAC AGCGGTGCTC 4800

GCGAAATCTT GACTATCAAA GTTCTTGGTG GTTCAGGACG TAAATTTGCA AACATCGGTG 4860

ATGTTATCGT GGCATCTGTA AAACAAGCTA CTCCTGGTGG TGCGGTTAAA AAAGGTGAGG 4920

TTGTTAAAGC AGTTATCGTT CGTACTAAAT CAGGTGCTCG TCGTGCTGAT GGTTCATACA 4980
TCAAATTTGA CGAAAACGCA GCAGTTATCA TCCGTGAAGA CAAAACTCCT CGCGGAACAC 5040

GTATCTTTGG CCCAGTTGCA CGTGAATTGC GTGAAGGTGG CTTCATGAAG ATCGTGTCAC 5100

TTGCTCCAGA AGTACTTTAA TTTTTAGGAA CAAACTAGTC CCCTAGCTTC AAGCTAGGGT 5160

GCCCTTATGG GCGTAAGAAA AATCAAGGAG AAACCTAATG TTTGTAAAAA AAGGCGACAA 5220

AGTTCGCGTA ATCGCTGGTA AAGATAAGGG AACAGAAGCT GTTGTCCTTA CTGCCCTTCC 5280 

AAAAGTAAAC AAAGTTATCG TTGAAGGTGT TAACATTGTT AAGAAACACC AACGTCCAAC 5340 

TAACGAGCTT CCTCAAGGTG GTATCATCGA GAAAGAAGCA GCTATCCACG TATCAAACGT 5400 

TCAAGTTTTG GACAAAAATG GTGTAGCTGG TCGTGTTGGA TACAAATTTG TAGACGGTAA 5460 

AAAAGTTCGC TACAACAAAA AATCAGGCGA AGTGCTTGAT TAATCACGAA GGAAAGGAGA 5520 

AGTATAATGG CAAATCGTTT AAAAGAAAAA TATCTTAATG AAGTAGTTCC TGCTTTGACA 5580 

GAACAATTCA ACTACTCATC AGTGATGGCT GTGCCTAAAG TAGATAAGAT TGTTTTGAAC 5640 

ATGGGTGTTG GTGAAGCTGT ATCAAACGCT AAAAGCCTTG AAAAAGCTGC TGAAGAATTG 5700 

GCACTTATCT CAGGTCAAAA ACCACTTATC ACTAAAGCTA AAAAATCAAT CGCCGGCTTC 5760 

CGTCTTCGTG AAGGTGTTGC GATCGGTGCA AAAGTTACCC TTCGTGGTGA ACGTATGTAC 5820 

GAATTCTTGG ATAAATTGGT ATCAGTTTCA CTTCCACGTG TACGTGACTT CCACGGTGTC 5880 

CCAACAAAAT CATTTGATGG ACGCGGGAAC TACACACTTG GTGTGAAAGA ACAATTAATC 5940 

TTCCCAGAAA TCAACTTCGA TGACGTTGAC AAAACTCGTG GTCTTGACAT CGTTATCGTA 6000 

ACAACTGCTA ACACTGACGA AGAGTCACGT GCATTGCTTA CAGGCCTTGG AATGCCTTTT 6060 

GCAAAATAAT ATAGGAGGTA AATCTAATGG CTAAAAAATC AATGGTAGCT AGAGAGGCTA 6120 

AACGCCAAAA AATTGTTGAC CGTTATGCTG AAAAACGTGC TGCATTAAAG GCGGCAGGGG 6180 

ACTACGAAGG TTTATCTAAA TTACCTCGCA ACGCCTCACC GACTCGTTTA CATAATCGTT 6240 

GTAGGGTTAC GGGGCGCCCA CATTCAGTTT ACCGCAAATT TGGTCTGAGT CGTATCGCTT 6300 

TTCGCGAACT TGCGCATAAA GGTCAAATTC CTGGTGTAAC AAAAGCATCT TGGTAATTTA 6360 

AGATATCAAG AGCGTCAAAA CTCCAAGTAA AAATAGGAAA CTTGACGAAG AAACTAAAGT 6420 

TTCTAGGAAA GTTTATCTTT TTCACACAGA GTTTAGCCCG GGTTCAATTG GGCTTGCCAA 6480 

TTTGAACACG AGCTACAGCT TTGGCAAAAA AGACCAATTT GCTTTGGAGC ATTGCTTCTG 6540 

CATTAAATTG TCTATTTTTG CTCGTGCTGT TACGCTCTTT GTATCATGTA TTAACTAGCA 6600 

AGTGCAACTT GCAAACTACT AGTAAGAGGA GAAAAACAAA ATGGTTATGA CTGACCCAAT 6660 

CGCAGACTTC CTAACTCGTA TTCGTAATGC TAACCAAGCT AAACACGAAG TACTTGAAGT 6720 

ACCTGCATCA AACATCAAAA AAGGGATTGC TGAAATCCTT AAACGCGAAG GTTTTGTAAA 6780

AAACGTTGAA ATCATTGAAG ATGACAAACA AGGCGTCATC CGTGTATTTC TTAAATACGG 6840

ACCAAATGGT GAGAAAGTTA TCACTAACTT GAAACGTGTT TCTAAACCAG GACTTCGTGT 6900

CTACAAAAAA CGTGAAGACC TTCCAAAAGT TCTTAACGGA CTTGGAATTG CCATCCTTTC 6960

AACTTCTGAA GGTTTGCTTA CTGATAAAGA AGCACGCCAA AAGAATGTTG GTGGTGAGGT 7020

TATCGCTTAC GTTTGGTAAA ATCAAGATAC AAAGCTCGTA AAGAACAAAG CAAAATTAGG 7080

AAGTTGGAGA AGTTTGTTTA CAAACAAGCC AACTTATCTA TTTTGCACAG TTCTTAGAGC 7140

GTGTTCAGTT CAGCTCTTGA ACTAAATAAG TATCTGAACC CCGTGAAAAC TGGCCGTTCT 7200

GGCCTGACAA TTTAACAGGA GAAAATAAAC ATGTCACGTA TTGGTAATAA AGTTATCGTG 7260

TTGCCTGCTG GTGTTGAACT CGCTAACAAT GACAACGTTG TAACTGTAAA AGGATCTAAA 7320

GGAGAACTTA CTCGTGAGTT CTCAAAAGAT ATTGAAATCC GTGTGGAAGG TACTGAAATA 7380

ACTCTTCACC GTCCAAACGA TTCAAAAGAA ATGAAAACTA TCCACGGAAC TACTCGTGCC 7440

CTTTTGAACA ACATGGTTGT TGGTGTATCA GAAGGATTCA AGAAAGAACT TGAAATGCGT 7500

GGGGTTGGTT ACCGTGCACA GCTTCAAGGA TCTAAACTTG TTTTGGCTGT TGGTAAATCT 7560

CATCCAGACG AAGTTGAAGC TCCAGAAGGA ATTACTTTTG AACTTCCAAA CCCAACAACA 7620

ATCGTTGTTA GCGGAATTTC AAAAGAAGTA GTTGGTCAAA CAGCTGCTTA CGTACGTAGC 7680

CTTCGTTCAC CAGAACGATA TAAAGGTAAA GGTATCCGTT ACGTTGGTGA ATTCGTTCGC 7740

CGTAAAGAAG GTAAAACAGG TAAATAATGT TGAGTGGTTG ATCATCAACC ACCAACCTAT 7800

TTTCCAACTT TGTGCATAGC ACACGATTTA AAACTAAAGA GGTGAAAACT GTGATTTCAA 7860

AACCAGATAA AAACAAACTC CGCCAAAAAC GCCACCGTCG CGTTCGCGGA AAACTCTCTG 7920

GAACTGCTGA TCGCCCACGT TTGAACGTAT TCCGTTCTAA TACAGGCATC TACGCTCAAG 7980

TGATTGATGA CGTAGCGGGT GTAACGCTCG CAAGTGCTTC AACTCTTGAT AAAGAAGTTT 8040

CAAAAGGAAC TAAAACTGAA CAAGCCGTTG CTGTCGGTAA ACTCGTTGCA GAACGTGCAA 8100

ACGCTAAAGG TATTTCAGAA GTGGTGTTCG ACCGCGGTGG ATATCTATAT CACGGACGTG 8160

TGAAAGCTTT GGCTGATGCA GCTCGTGAAA ACGGATTGAA ATTCTAATAG GAGGACACTA 8220

GAAAATGGCA TTTAAAGACA ATGCAGTTGA ATTAGAAGAA CGCGTAGTTG CTGTCAACCG 8280

TGTTACAAAA GTTGTTAAAG GTGGACGTCG TCTTCGTTTC GCAGCTCTTG TTGTTGTTGG 8340

TGACCACAAT GGTCGCGTAG GATTTGGTAC TGGTAAAGCT CAAGAAGTTC CAGAAGCAAT 8400

CCGTAAAGCA GTAGATGATG CTAAGAAAAA CTTGATCGAA GTTCCTATGG TTGGAACAAC 8460

AATCCCACAC GAAGTTCTTT CAGAATTCGG TGGAGCTAAA GTATTGTTGA AACCTGCTGT 8520

AGAAGGTTCT GGAGTTGCCG CTGGTGGTGC AGTTCGTGCC GTTGTGGAAT TGGCAGGTGT 8580

GGCAGATATT ACATCTAAAT CACTTGGTTC TAACACTCGA ATCAACATTG TTCGTGCAAC 8640

TGTTGAAGGT TTGAAACAAT TGAAACGCGC TGAAGAAATT GCTGCCCTTC GTGGTATTTC 8700

AGTTTCTGAT TTGGCATAAG AAAGGGGATA AAATGGCTCA AATTAAAATT ACTTTGACTA 8760

AGTCTCCAAT CGGACGCATT CCATCACAAC GTAAAACTGT TGTAGCACTT GGACTTGGCA 8820

AATTGAACAG CTCTGTTATT AAAGAAGATA ACGCTGCTAT CCGTGGTATG ATCACAGCAG 8880

TATCTCACTT AGTAACAGTT GAAGAAGTAA ACTAATGAaG TTTTAGGGGA TGTGCACTGT 8940

ACCATCCCCT AAAACTAGAT ATAGTCATCT ATGATGACAT CGTATAGGCG AGTTGATGGG 9000

GGAGACAACC TTTTCTCCCT TATCGGCGCT AGCATTTTAC AAAAGAGGAG AAAATAAAAA 9060

TGAAACTTCA TGAATTGAAA CCTGCAGAAG GTTCTCGTAA AGTACGTAAC CGCGTTGGTC 9120

GTGGTACTTC ATCAGGTAAC GGTAAAACAT CTGGTCGTGG TCAAAAAGGT CAAAAAGCTC 9180

GTAGCGGTGG CGGAGTTCGC CTTGGTTTTG AAGGTGGACA AACTCCATTG TTCCGTCGTC 9240

TTCCAAAACG TGGATTCACT AACATCAACG CTAAAGAATA CGCAATTGTG AACCTTGACC 9300

AATTGAACGT CTTTGAAGAT GGTGCTGAAG TAACTCCAGT TGTTCTTATC GAAGCAGGAA 9360

TTGTTAAAGC TGAAAAGTCA GGTATTAAAA TTCTTGGTAA CGGTGAGTTG ACTAAGAAAT 9420

TGACTGTGAA AGCAGCTAAA TTCTCTAAAT CAGCTGAAGA AGCTATCACT GCTAAAGGTG 9480

GTTCAGTAGA AGTCATCTAA GAGAGGTGAC CTATGTTTTT TAAATTATTA AGAGAAGCTC 9540

TTAAAGTCAA GCAGGTTCGA TCAAAAATTT TATTTACAAT TTTTATCGTT TTGGTCTTTC 9600

GTATCGGAAC TAGCATTACA GTTCCTGGTG TGAATGCCAA TAGCTTGAAT GCTTTAAGTG 9660

GATTATCCTT CTTAAACATG TTGAGCTTGG TGTCGGGGAA TGCCCTAAAA AACTTTTCGA 9720

TTTTTGCCCT AGGAGTTAGT CCCTATATCA CCGCTTCTAT TGTTGTCCAA CTCTTGCAAA 9780

TGGATATTTT ACCCAAGTTT GTAGAGTGGG GTAAACAAGG GGAAGTAGGT CGAAGAAAAT 9840

TGAATCAAGC TACTCGTTAT ATTGCTCTAG TTCTCGCTTT TGTGCAATCT ATCGGGATTA 9900

CAGCTGGTTT TAATACCTTG GCTGGAGCTC AATTGATTAA AACTGCTTTA ACTCCACAAG 9960

TTTTTCTGAC GATTGGTATC ATCTTAACAG CTGGTAGTAT GATTGTCACT TGGTTGGGTG 10020

AGCAAATTAC AGATAAGGGA TACGGAAACG GTGTTTCCAT GATTATCTTT GCCGGGATTG 10080

TTTCCTCAAT TCCAGAGATG ATTCAGGGCA TCTATGTGGA CTACTTTGTG AACGTCCCAA 10140

GTAGCCGTAT CACTTCATCT ATCATTTTCG TAATCATTTT GATTATTACT GTATTGTTGA 10200

TTATTTACTT TACAACTTAT GTTCAACAAG CAGAATACAA AATTCCAATC CAATATACTA 10260

AGGTTGCACA AGGTGCTCCA TCTAGCTCTT ACCTTCCGTT AAAAGTAAAC CCTGCTGGAG 10320

TTATCCCTGT TATCTTTGCC AGTTCGATTA CTGCAGCcTG CGGCTATTCT TCAGTTTTTG 10380

AGTGCCACAG GTCATGATTG GGCTTGGGTA AGGGTAGCAC AAGAGATGTT GGCAACTACT 10440

TCTCCAACTG GTATTGCCAT GTATGCTTTG TTGATTATTC TCTTTACATT CTTCTATACG 10500

TTTGTACAGA TTAATCCTGA AAAAGCAGCA GAGAJcCCTAC AAAAGAGTGG TGCCTATATC 10560

CATGGAGTTC GTCCTGGTAA AGGTACAGAA GAATATATGT CTAAACTTCT TCGTCGTCTT 10620

GCAACTGTTG GTTCCCTCTT CCTTGGTGTG ATTTCCATTT TACCGATTGC AGCTAAAGAT 10680

GTATTTGGTC TTTCTGATGT TGTTGCCTTT GGTGGAACAA GTCTCTTGAT CATTATCTCT 10740

ACAGGTATCG AAGGAATCAA GCAATTGGAA GGTTACCTAT TGAAACGTAA GTATGTTGGT 10800

TTCATGGACA GAACAGAATA AAAGTATTTA CTGAATCAGT AAATACTGAG GGAGTGGAGG 10860

TTTAAACTCT GACATTTGTA AGAGTTGGAT CTCCCCTCTT CTATTTTGTT TTTAAATCGG 10920

GGTGAAAAGA CTTTTTGCTT CTATTTAAAA ATAAAATAAG GAGATCAAAT CATGAATCTT 10980

TTGATTATGG GCTTACCTGG TGCAGGTAAG GGAACTCAAG CAGCAAAAAT CGTAGAACAA 11040

TTCCATGTTG CACATATCTC AACAGGTGAT ATGTTCCGCG CTGCAATGGC AAATCAAACT 11100

GAAATGGGTG TTCTTGCTAA GTCATATATT GACAAGGGTG AATTGGTTCC TGACGAAGTT 11160

ACAAATGGAA TCGTAAAAGA ACGCCTTTCA CAAGATGATA TTAAAGAAAC AGGATTCTTA 11220

TTGGATGGTT ACCCACGTAC AATTGAACAA GCTCATGCCT TGGACAAAAC ATTGGCTGAA 11280

CTTGGCATTG AACTAGAAGG TGTTATCAAT ATTGAAGTGA ACCCTGACAG CCTTTTGGAA 11340

CGTTTGAGTG GGCGTATCAT CCACCGCGTA ACTGGAGAAA CTTTCCACAA GGTCTTTAAC 11400

CCACCAGTTG ACTATAAAGA AGAAGATTAC TACCAACGTG AAGATGATAA GCCTGAGACA 11460

GTAAAACGTC GTTTGGATGT TAATATTGCT CAAGGAGAAC CAATCATTGC TCACTACCGT 11520

GCCAAAGGTT TGGTTCATGA CATCGAAGGT AATCAAGATA TCAATGATGT CTTCTCAGAT 11580

ATTGAAAAAG TATTGACAAA TTTGAAATAA AGCGTTTTTC ACACTTGCAA AAATCCGCTA 11640

CAAATGTTAT ACTGAGATAG TCTGACTTAT AATTGTTGTC TCTGTGTCTA GAGGCATCGA 11700

ATCGAAATTT ATGGAGGTGC TTTTGCGTGG CAAAAGACGA TGTGATTGAA GTTGAAGGCA 11760

AAGTAGTTGA TACAATGCCG AATGCAATGT TTACGGTTGA ACTTGAAAAT GGACATCAGA 11820

TTTTAGCAGG G 11831


(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10726 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

CCCGGCATTT GAAAGCTATT CGTGAAGGAT TTATGATGGC AATGCCTTTG ATTTTAGTCG 60 

GCTCTTTATT TCTTATTCTA ATCAGTTGGC CTCAAGAGGC TTTTACAAAT TGGCTGAATA 120 

GTGTTGGATT GCTAAGTATC TTGACAACTA TGAATCAGTC AACAGTAGCG ATTATCTCCT 180 

TGGTCGCTTG TTTCGGTATT GCCTACAGGT TGTCGGAAGG ATATGGTACA GATGGTCCGT 240 

CGGCAGGGAT CATAGCCTTA TCCAGTTTTG TATTGATGGC ACCTCGTTTT TCGAGTATGG 300 

TTTATGATAA AAATGGGGAG CAGGTCAAGC AGTTATTTGG CGGCGCAATA CCATTTTCTA 360 

GCCTGAATGC ATCTTCTTTG TTTATGGCGA TTACTATTGG ATTGGTTACA GCAGAGATTT 420 

ATCGTATGTT TATCCAGCGC GGAATTACGA TAAAAATGCC AAGTGGTGTC CCAGATGTAG 480 

TAAGTAAATC ATTTTCAGCT CTTTTATCTG GTTTTACTAC TTTTGTTTTG TGGGCTTTGG 540 

TCTTAAAAGG TCTTGAAGCG GCAGGAGTTG CAGGAGGTCT CAACGGACTC CTAGGTGCAA 600 

TTGTTGGAAC ACCGCTTAAG TTAATTGCAG GAACGCTTCC AGGTATGATT CTATGTGTTA 660 

TTGTAAACTC ATTCTTTTGG TTCTGTGGAG TTAATGGGGG ACAAGTTTTA AATGCTTTTG 720 

TAGACCCAGT TTGGTTACAA TTTACTACAG AAAACCAAGA AGCTGTGGCT GCAGGACAAA 780 

CACTCCAACA CATTATTACA TTACCGTTTA AAGATTTATT TGTATTTATT GGTGGCGGTG 840 

GAGCGACTAT TGGTCTTGCG ATTTGTCTCT TCCTATTTAG TAAGAGTCGT GCGAATAAAA 900 

CATTAGGTAA GCTAGCTATT ATACCGTCTA TTTTTAATAT CAATACAGCT ATTCTATTTA 960 

CGTTTCCAAC AGTTTTAAAT CCGATTATGC TGATTCCGTT TATTGCTACT CCTACAATCA 1020 
ATGCCTTGAT TACCTATGTA TCAATGGCTG TAGGATTAGT ACCCTATACA ACAGGTGTAA 1080

TCCTTCCGTG GACAATGCCA CCGATTATAG GAGGCTTCCT TGCAACAGGG GCTAGTTGGC 1140 

GAGGAGCTCT ATTACAAGTT GTTTTGATTT TGGTTTCTGT AGCAATTTAT TATCCATTCT 1200 

TCAAAATTGC AGATAAACGC AATCTTGAAA AAGAAAAAGC TACTGTTGGA GGGAAATAAG 1260 

ATGGTTATCA GAGTATTTGA TCAACAGAAA AATACTTATT CTAGCTTTGC CTTAGAGGAA 1320 

TTAAGTTACT ATATGAATCG GGTCTTTAAG ACTAACATAG AGCTTGTCGA GGAGAAGGAA 1380 

GCGGATATTT TTGTAGGATT AGTCAATAAA GAGGACAGAA AAGACCATGT TCTTATCTCA 1440 

TTAGACAAGG GTAAGGGGAG AATTGAGTCT AATACAATTG TAGGTTTACT TATTGGAATT 1500 

TACCGAATGT TTCATGAATT TGGGGTTGTG TATACTAGAC CAGGGCGCAG ACATGACTTT 1560 

GTTCCAGAGT TACGATTTGA AGATTTTTTA GATAAACAGC TATCTATAGA TGAAACAGCC 1620

AGTTACTATC ATAGGGGAGT ATGTATAGAG GGAGCGGATT CATTTGAAAA TATACTAGAT 1680

TTCATTGATT GGCTACCTAA GATTGGGATG AACAGTTTTT TCATCCAGTT TGAAAATCCT 1740

TACTCTTTTT TGAAACGTTG GTATGAACAT GAATTTAATC CATATCTAAA TAAAGAACAA 1800

TTTTCAAATG AATTAGTACA AGAATTGAGT GATAGGTTGG ATAAAGAATT GCAAAAAAGA 1860

GGTCTTATTC ATCATCGTGT TGGTCATGGA TGGACAGGTG AAGTTTTAGG TTACTCTTCA 1920

AAATTTGGCT GGGAATCAGG TCTTAGTATT TCAGAGGAGA AGAAACCCTA TGTCGCTGAA 1980

ATAAACGGGA AACGAGAATT GTTTAATACG GCTCCGATTT TAACCAGCCT GGATTTTTCA 2040

AATCCAGATG TAGCTGATAA GATGGTAGAA ATTATCAAGG ATTATGCCAA GAAAAGACCT 2100

GATGTTAACT ACTTACATGT ATGGTTGTCG GATGCTCGTA ATAATATTTG TGAATGCGAA 2160

AACTGTAGAC AAGAATTGGT TTCGGATCAG TATATTCGTA TTCTCAATCA ATTGGATAGG 2220

GCTTTAACGA GTGAGGGATT AGATACAAAG ATTTGTTTTC TGCTTTATCA TGAGTTGTTA 2280

TGGGCACCTC AGAAAGAAAA ATTAGATAAT CCTGAACGCT TTACCATGAT GTTTGCACCG 2340

ATTACAAGAA CATTTGAAAT GAGTTATGCA GATGTAGATT TTGACAATTC CATACCTACG 2400

CCTAAACCTT ATATGCGTAA TAAAATTATA CTTCCGAATT CTCTTGAGGA AAATTTATCT 2460

TATCTTTTTG AGTGGCAAAA AGCATTTAAA GGAGATAGTT TCGTATATGA CTATCCTTTA 2520

GGGCGTGCTC ATTATGGCGA TTTAGGCTAT ATGAAAATTA GTCAAACTAT TTACAGAGAT 2580

GTATCTTATC TTTCCAACCT ACATTTGAAC GGGTACATTT CGTGTCAAGA ATTACGTGCC 2640

GGATTCCCTC ATAATTTTCC TAATTATGTC ATGGGGGAAA TGCTCTGGAA GAAGACAAGA 2700

AGTTATGAAG AATTGATTGA AGAATACTTT TCTGCTTTGT ATGGGGAAAA TTGGCAGTCT 2760

GTTGTTGAAT ATTTAGAAAA ATTATCCATT TATTCCTCTT GTGATTATTT TAATGCAATT 2820

GGCAGCCGTC AAAGTGATGT TTTAGCGAAT CATTATTATA TAGCTTACAA TCTAGCTGAT 2880

AATTTTTTAC CAATTATTGA GGAAAATATT TCTAAGTTAT TAAATAGTCA AAAGGATGAA 2940

TGGAAACAGC TCAGTTATCA TCGTGAATAT GTTGTTAAGA TGGCGAAGGC TTTATATCTT 3000

CAAGCAACTG GAAAAACAAG GCAAGCTCAA GATGAATGGA GAAATGTGTT GAATTATATC 3060

CGTGGGCACG AATTGCTATT TCAATCTAAT TTGGATGTTT ATCGTGTAAT TGAAGTAGCA 3120

AAAAATTACG CTGGTTTCCA CTTATAAATC ATAAGTATAG AAAATGAACT AAGGTATTCA 3180

GAGAAGATTG ATCCTAAATA TTATGAAATT TAAGGATTTT TAAGATATTT AGGGTCAACT 3240

TTCTATTTAT ATCGTAGCGA AGTCATTTTA ATAATGATGT GTAAAAGATG GATCAAGATT 3300

GAGGAGGAAG AAAGATGAAA TCAAAAGAAG AAATAAATAT GCTTGGTTTT ACAATTGTCG 3360

CTTACGCAGG AGATGCAAGG TCAGATTTGA TGGATGCTTT GGCGTTTGCG AGAGATGGAT 3420

ATTTTGAACA GGCAAGAGAA TTGGTTGAGT CTGCAAACGA CTCAATAGTG TCTGCCCATC 3480 

GAGAACAGAC TAATTTATTA GCGGAGGAGG CATATGGAGA TAATTTTGAA GTGAGCTTTA 3540 

TTATGATTCA TGGTCAAGAT ACTTTGATGA CAACGATGCT ATTGTATGAT CAGGTAAAGT 3600 

TTTTTATTGA TGAATATGAA CGAATTCGAA AGATTGAAGA ACATATTGGT TTGCAATGAG 3660 

GATTAGTCAT GGAAAATTTA CAGGTTAAAG CCTTACCGAA GGAGTTTTTA TTAGGAACTG 3720 

CTACCGCTGC TTATCAAGTA GAGGGTGCAA CTAGGGTAGA TGGCAAAGGA ATAAATATGT 3780 

GGGATGTTTA TTTGCAAGAA AATAGTCCGT TCTTACCAGA TCCAGCTAGT GATTTTTATT 3840 

ATCGTTACGA AGAGGATATA GCTTTGGCGG CAGAACATGG TTTGCAGGCT TTGCGTTTAT 3900 

CTATTTCTTG GGTTCGTATA TTTCCTGATA TAGATGGGGA TGCTAATGTA TTAGCTGTTC 3960 

ATTATTACCA TAGAGTTTTT CAGTCTTGCT TAAAACATAA TGTGATTCCG TTTGTTTCTT 4020 

TACATCATTT TGATTCGCCT CAGAAAATGT TAGAAACAGG GGATTGGTTG AACAGAGAGA 4080 

ATATTGATCG TTTCATACGA TATGCTCGCT TTTGTTTCCA AGAATTTACA GAAGTCAAGC 4140 

ATTGGTTTAC AATCAATGAA CTGATGTCTC TTGCTGCAGG TCAATATATA GGAGGTCAGT 4200 

TTCCTCCAAA TCATCATTTT CAATTATCTG AAGCAATTCA AGCGAATCAT AATATGTTGT 4260 

TGGCGCATGC TCTTGCAGTC CTCGAATTTC ATCAATTAGG GATTGAGGGA AAGGTAGGTT 4320 

GTATTCATGC TTTAAAGCCA GGCTATCCTA TTGATGGGCA AAAAGAAAAT ATTTTGGCAG 4380 

CTAAACGGTA TGATGTTTAT AATAATAAAT TTCTATTAGA TGGAACTTTT TTGGGCTACT 4440 

ACAGTGAGGA CACGCTTTTT CACTTGAATC AAATATTGGA AGCTAATAAT TCTAGCTTTA 4500 

TTATTGAAGA TGGTGATTTA GAAATTATGA AGAGAGCTGC ACCTCTTAAT ACGATGTTTG 4560 

GGATGAATTA TTATCGTTCA GAATTTATTC GTGAATACAA AGGTGAAAAT AGACAAGAAT 4620

TTAATTCAAC AGGAATAAAA GGACAGTCTT CTTTTAAATT AAATGCTCTA GGTGAATTTG 4680 

TAAAAAAACC TGGTATTCCG ACAACAGATT GGGATTGGAA TATTTATCCT CAAGGGTTAT 4740 

TTGATATGTT GCTTCGTATC AAAGAAGAAT ATCCTCAACA TCCGGTCATT TATTTAACTG 4800 

AAAATGGTAC AGCCCTTAAA GAAGTTAAGC CAGAGGGCGA GAATGATATT ATTGATGACA 4860 

GTAAGAGAAT CCGTTATATT GAGCAACATT TACACAAAGT TTTAGAGGCT CGAGATAGAG 4920 

GAGTCAATAT TCAAGGCTAT TTTATATGGT CTTTGCAAGA TCAATTTTCT TGGGCGAATG 4980 

GCTACAATAA GCGATATGGT CTTTTCTTTG TTGATTATGA AACACAGAAG AGATATATTA 5040 

AGAAAAGTGC TCTTTGGGTA AAAGGGCTAA AACGGAATTA AGGTTAGCGA TTTGACTGAT 5100 

GTTTAATATG TTTTAAATAT GAGGTTGAAT TTTTTATAGG AGGAGTTTTA TGGATAAGCT 5160

AGTCGCTGCC ATTGAAAAGC AACAAGGGAA ATTTGAAAAA ATTTCTACTA ATAACTATAT 5220

GATGGCTATT AAAGATGGAT TCATTGCTAC TATGCCTTTA ATTATGTTTT CAAGCTTTTT 5280

GATGATTATT ATTATGATTC CTAAAAATTT CGGAGTAGAG TTACCGAGTC CAGCTATTGT 5340 

CTGGATGAGA AAAGTGTATA TGTTAACCAT GGGAGTTTTG GGTATTATTG TTTCAGGGAC 5400 

TGTTGGAAAG TCATTAGTTG GAAATGTTAA CAGAAAAATG CCTCACGGAA AGGTAATAAA 5460 

TGATATTTCT GCAATGTTGG CAGCCATATG TAGTTATCTG GTATTAACTG TAACGCTTGT 5520 

AGTTGATGAG AAGACGGGAT CTACAAGTTT GTCGACAAAC TATTTAGGAT CTCAAGGATT 5580 

GATAACTTCG TTTGTCAGTG CCTTTATTAC TGTAAATGTT TACCGATTCT GTATTAAGCG 5640 

AGACATTACT ATTCATTTAC CTAAGGAAGT TCCTGGGGCT ATATCACAAG CTTTTAGAGA 5700 

TATTTTCCCT TTTTCTTTTG TTTTACTTAT TAGTGGTTTG TTAGATATTG TATCTCGGTT 5760 

TAGTTTAGAT GTTCCTTTTG CCCAAGTATT TCAACAACTA TTGACTCCTA TTTTTAAGGG 5820 

GGCAGAATCA TATCCTGCTA TGATGTTGAT TTGGTTTATG TGTGCTTTGC TTTGGTTTGT 5880 

TGGAATTCAT GGACCATCTA TTGTCTTACC TGCTGTTACA GCTTTGCAAC TGAGCAATAT 5940 

GGAAGAGAAT GCTCAACTTC TTGCAAATGG GCAGTTCCCT TATCATTCTT TAACACCTAA 6000 

TTTCGGGAAT TATATCGCTG CTATTGGAGG AACGGGGGCT ACCTTTGTTG TACCATTTAT 6060 

TTTGATTTTC TTTATGCGGT CTAAACAATT AAAATCGGTA GGTAAAGCTA CAATTACTCC 6120 

TGTTTTATTT GCGGTAAATG AACCTCTTCT ATTTGGTATG CCTGTTATTT TGAATCCCTA 6180 

TCTTTTTGTC CCTTTTTTGA TGACTCCACC AGTGAATGTA TTTCTAGGAA AGGTCTTTAT 6240 

TGATTTCTTT GGAATGAATG GATTTTATAT CCAGTTACCT TGGACCTTTC CTGGTCCCTT 6300 

GGGATTGTTA ATTGGAACGA ATTTTCAACT TATCTCCTTT GTATTTTTAT CTTTGATTTT 6360 

AGTTGTCGAC ATATTGATTT ATTTGCCATT CTGTAGAGCG TATGATAGAC AGTTACTGGT 6420 

GAAAGAAGAT ATTGCAAGCT CAAATGATAT TATTTTAGAG GAGGATACAA GTGAAATAAT 6480 

TCCTGGTGAG ATAGATGAAA TAAAAAGTAA GGAGTTGAAA GTACTGGTTC TTTGTGCAGG 6540 

GTCTGGAACA AGTGCGCAAT TAGCCAATGC AATTAACGAG GGGGCTAACT TAACAGAGGT 6600 

TAGAGTGATT GCGAATTCAG GAGCGTACGG AGCTCATTAT GATATTATGG GTGTTTATGA 6660 

TTTAATTATT CTGGCCCCAC AAGTTCGGAG TTATTATAGA GAGATGAAGG TGGATGCAGA 6720 

AAGATTAGGT ATTCAGATAG TTGCTACCAG AGGAATGGAA TATATTCATT TAACAAAGAG 6780 

TCCAAGTAAA GCCTTACAAT TTGTATTGGA GCATTACCAA GCTGTGTAGT AAGTTTTTCG 6840 

ATCTTTTATT TGAGTAAAGA TTTTGTTTAC AGATAGGCTT GGATTTAAAA ACGTTCCCCC 6900

TTTTTTAATA TAAGAATCCC TCTTTCACAA TTGTAAAAAG AGGGATTTTG TATTTTATCT 6960

CTTAGACCAA GTTCTCTTCA TAAAGAGAAG GAGGATTGGG TAAATCTCCA AGCGCCCTGC 7020

AATCATTGCA AAGGATAGGA GAATTTTTGA GATGGGACTA AAGATTGAGA AACTAGAAGT 7080

GGTTCCTAGA ATAGGCCCGA TATTATTGAA ACAGCTAAAG ACAGCGCTGG TCACGACCAG 7140

AAAATCATTG CTATCTAGGC TGACAATAAA GATAAGCGCT AGCAAAATCA TAGCATAGAT 7200


GACAAAGTAC TTGAGAATCT TATGCTGGGT ATCTTTGTCA ATCACCGTTT TATTAAPATfl 7260

GAGGGTCAAA ACACGGTGGG GCGATAGGAT TGACAAAATT TGGTTTTTGG C AA TTT*! "I HZ A 7320

AAGGATGAGG CCTCGAATAA TCTTGAGTCC ACCTGCAGTT GATCCAGCAG AfWA ppr » n* 7380

TGCCATGAGG AAAAGGAGGA TAAACTGGGA GAAGAGGGGC CAGTTGGTAA IniL 1 A 1 A 7440

TCCAAAACCA GTTGTTGTAA TGATGTTGGA AACCTGGAAG AAGGTCATTT pa A Anr* , 7v >| T> r r 7500

TGAAAACCCT GGGTAGAGGT AGAGGGTGTT GAGGCTAATC AAGCCTGTAG nnt\\*S*. Alj i At, 7560

AATGACCAAG TAAGCCCTAA GCTCTTCATC TCCAAAGAAG GCCTTGATGC 7620

GAGGTAGTAG TAGAGGTTGA AATTTACTCC AAAAACCAGA ACTCCGATAC r PCl app hfl A T a 7680

GGTAATCAGT GAGCTGCCAT AGTGGGCAAT TCCGTCGTTA TAGACGGTAA /MJ V-V. 1\» 1^ A\J 1 7740

TCCCGCTGTC CCCATAGCAA TAACAAAACT ATCGTAGAGA GGCATACCGG PTAP.ATA A'Pa V» 1 no A 1 AA 1 A 7800

GATGATGACA AAGAGGGAGA AGAGAGCTAG ATAAAGGAGA TAGAGAATCT 7860

TTTTAGTTTG GATACAACCT TGCCAAAAAC AGGACCTGGA ACCTCAGCCT TP ATT" A prnv 1 7920

TAGGTGGCTA TTTTTGGCAT TGTCCATAAT AGCAAGTGCA AAAACAAGCA 7980


TCCAATCAAG TGGGTAAAAC TTCGCCAGAA GAGGAGGGAA GGGCTGAGAA CCGAAACGTC 8040

GTTCAAAATA CTTGCTCCAG TAGTTGTAAA TCCAGAACTA ATTTCAAAAA AGGCATCAAT 8100

AAGGCTGGGG ATTTGCCCAG AAAAGACAAA GGGGAGACCA CCAAAGAAAG ACCAAAGGAT 8160

CCAACAGAGG GCAACGATCA AGACTCCCTC CTTGGCATAA ATCCGTTGAT TTTTTGGCTT 8220

CTGTAAACTC CCTGAACCGC CTAACAATAC GAGAATCCCT ATGGTCGAAA AGAGGGCTGT 8280

AAAGACTTGG CTCGATTCAC GGTAATAGAC AGCAATCGCA ACAGGAACCA AAAGAAGAAG 8340

AGCTTCAATC AAAAGTAATT TTGAAAGGAG GTAACGAATC ATACTTTTAT TCATTTCTTA 8400

CCTCGCGATC AAGTCATAAA TCTTGGTGAT GTTTGGCAAC AAGGTTGTTA CTAGGAGCTT 8460

GTCTCCAACT TCCAACATAT CCTCCCCAGT TGGGAAAATA GTCTTGCCCT TTCGAATAAT 8520

GGCTGCAATA AGAACCCCTT TTTTCAATTT CAGTTGAGAA AGAGGTTTGG CAGTCATTTT 8580

ATTGGCTTCC TTGATATGGA ATTGCAGGGT TTCGATTTGG CCATTGGCTA GATGGTGCAT 8640

AGCTTGAAGG TCTGAATACT GGGCATTAAC TCGACCACGA ATAAAGTGCA TAATCGTATC 8700

TACAGCGATG CTTTTAGGTG TGATGATACT TGAAAAATCA GGCGCATTGA TAATCTCGAG 8760

GAGACTGGTA CGATTGACCT TAGTAATATT TTTCTGTACA CCTACCCTGT CAAGGAACAT 8820

AGATGTAATC AGATTTTCCT CATCGACTCC TGTTAGAGTC GCAACGGCAT CATAGTGTTG 8880 

AGCACTTTCT TCCAGCAGGA TATCTTTTGC GGTTCCATCT CCTTGAACGA TGTAGAGATT 8940 

TGGGAATTTC TCGCTAAAGA AGCTGGCGAT TTCAGGATTG ATTTCAATGA CTTTTGTATC 9000 

GATACGACTA TCTTTGAGAA TACCAAGTAG ATAATAGGCA ATTCTACCTG CCCCAACGAT 9060 

GAGAAGGCTC TTCACGGCGC GTGATTTAAA ATAATTATGG AAGAGTATCA TATCGACACG 9120 

GTTACCAGTG ACAAAGATTC TATCTTTATC CTGTACAGTC ATGTCACCGC TTGGAATGAT 9180 

AATTTGATGA TCCCTCTCTA TCGCACAGAC AATGACATTA CCAAATTTTT TACGAAAATC 9240 

AGAAATGGGC ATTTGGCAAA GACCGCTGGT GGACTTGACG ACAAATTCCA TGAGGCTAAC 9300 

GCGTCCACCA GCAAAGCGTT CGACAGACAG GGCGTTGGGG AAGTCAATGA TATTCGCGAT 9360 

AGCGCGGGCA GCCAAGAGCT CAGGATTAAC GATAAGAGAA AAACCGAGAA TATTCTTTTC 9420 

CTTGAAATAA GAGTTAGAAT ATTCAGGGTT CCGCACCCGA ACGATAGTTT CTTTAGCTCG 9480 

CATTTTCTTG GCTAGAACTG CTGCAATCAT GTTGACTTCA TCGTGCTCAG TCAGGGCGAT 9540 

AAAGATATCA GAATCTTGGA GGCTGGCTTG CTCAAGAATG GCAAAATCGG CCCCGTTACC 9600 

AAGGATACCA ATGATATCAA AGCGACTGAC AATATGATTG AGAACAGCTT CGTCTTGCTC 9660 

AATCAGCAAA ACATCATGGT TTTCTGCAAC CAAGGAGCGA CAGAGGGCAA AACCAACTTT 9720 

TCCCCCTCCG ACAAGGATAA TTTTCATAAT AAAACCTACT TTTTCATGAT GTAACTATCA 9780 

TACCCTTTTT CAAGAAAAAA TGCACCTACT AGGTAATAAC AAGAGTTTTT AGTGAAAATT 9840 

CGCTATAAGG TAAAACTATA CCCTAACCAA TTGAAATAGC TATTAGCGAC TTTCTCTGAA 9900 

ATATGGTATG ATAAAGGATA TACAAGGAGA TAAAATGAAT AATAATTTAC TGGTATTACA 9960 

ATCAGACTTT GGTCTGGTTG ATGGTGCGGT ATGGGCTATG ATTGGAGTGG CTTTAGAAGA 10020 

GTCTCCAACC TTAAAAATAC ATCACTTGAC GCACGATATC ACGCCTTATA ATATTTTTGA 10080 

GGGGAGCTAT CGTCTCTTTC AGACGGTGGA TTACTGGCCT GAGGGAACGA CGTTTGTATC 10140 

GGTTGTCGAT CCAGGTGTCG GTTCGAAACG TAAGAGTGTA GTTGCCAAGA CTGCAAAAAA 10200 

TCAATACATT GTCACGCCAG ATAATGGGAC GCTTTCCTTT ATCAAGAAAC ACGTTGGCAT 10260 

TGTAGCCATT CGTGAGATTT CTGAGGTGGC CAATAGGCGT CAAAACACAG AGCATTCTTA 10320 

TACCTTCCAC GGTCGTGATG TCTATGCCTA TACTGGTGCT AAACTGGCGA GTGGTGACAT 10380 

TACTTTTGAG GAAGTAGGGC CAGAGCTCAG TGTGGAACAG ATTGTAGAGC TTCCAGTCGT 10440

AGCGACCATC ATAGAAGATC ATCTGGTGAA GGGAGCCATT GATATTCTGG ATGTGCGTTT 10500

CGGTTCGCTT TGGACCTCTA TCACACGGGA AGAATTTTAC AAGCTGGAAC GAGAATTTGG 10560 

TGATCGTTTT GAAGTGACCA TCTATCATGC TGATATGCTG GTCTATCAAA ATCAGGTTGT 10620 

CTATGGCAAA TCATTTGCAG ATGTGAGAAT TGGGCAACCs ATcTTTACrc TCAGCaTCTt 10680 

CGATTAGCTG GGCAATTCGT TCTAGTTGGA TTTCGTCAAT CAAGGT 10726 
(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7163 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:






TTATCTTTAA CGATATCAAT CAAGATCTGG TCAATAAAGG GATTGGGGCT TATCGTGAAG 60

TTGGCATCCA AGCCCATGGA TATGTCTGTG ACGTGACAGA CGAGGACGGT ATCCAAGCCA 120

TGGTCAAGCA AATCGAACAA GAGGTTGGTG TCATTGACAT CCTCGTTAAT AACGCTGGTA 180

TTATCCGCCG AGTTCCAATG TGCGAAATGA GCGCCGCTGA TTTCCGTAAG GTCATCGATA 240

TTGACTTAAA CGCACCATTT ATCGTTTCAA AGGCAGTTAT TCCTTCTATG ATAAAGAAAG 300

GGCATGGAAA GATTATCAAT ATTTGTTCGA TGATGAGCGA ACTGGGACGT GAAACAGTTA 360

GCGCTTATGC TGCTGCTAAA GGGGGCTTGA AAATGTTGAC CCGCAACATT GCGTCTGAAT 420

ACGGTGGAGC CAATATCCAA TGTAACGGAA TTGGACCGGG TTATATTGCC ACTCCTCAAA 480

CAGCACCTCT TCGTGAATTG CAAGAAGATG GTTCTCGCCA CCCATTTGAC CAGTTCATCA 540

TTGCAAAAAC ACCTGCTGCA CGTTGGGGAA ATACTGAAGA TTTGATGGGC CCTGCTGTCT 600

TTCTCGCTAG TGATGCCAGC AATTTTGTCA ATGGCCACAT CCTATATGTA GATGGCGGTA 660

TCTTAGCCTA CATCGGAAAA CAACCTGAGT AAAAATAGAA AGAAGATCTT ATGAAAATCG 720

CATTAATCAA TGAAAATAGT CAAGCTAGCA AGAATCACAT TATTTACGAT AGTCTAAAAG 780

AAGCGACAGA TAAAAAAGGC TACCAATTAT TTAACTATGG TATGCGTGGA GAAGAAGGAG 840

AAAGTCAATT AACTTATGTG CAGAACGGAC TAATGGCTGC CATCCTTTTA AATACAAAGG 900

CAGTTGACTT TGTTGTTACC GGCTGTGGTA CGGGTGTAGG GGCTATGCTT GCTTTAAACA 960

GCTTCCCTGG TGTTGTCTGT GGTCTAGCAG TGGACCCAAC TGACGCTTAC CTTTATTCTC 1020

AAATCAATGG TGGTAACGCC TTGTCTATCC CTTATGCCAA AGGATTTGGC TGGGGGGCAG 1080

AACTGACCCT CAAATTGATG TTTGAACGCT TATTTGCTGA AGAAATGGGC GGTGGCTACC 1140

CAAGAGAACG TGTAATCCCT GAACAACGCA ACGCTCGTAT CTTAAACGAG GTGAAACAAA 1200

TCACCCACAA TGATTTGATG ACCATCCTTA AAATAATCGA CCAAGACTTC CTCAAAGACA 1260 

CCATCTCTGG CAAATACTTC CAAGAATACT TCTTTGAAAA CTGCCAAGAT GATGAAGTTG 1320 

CTGCTTATTT GAAAGAAGTA TTAGCCAAGT AAAGCTATTC TAAACCAGAA AGGAACTAAT 1380 

GGATGACGAA AATATTACTG TTTGGCGAAC CATTAATTCG AATTTCACCA TTAGATGCCA 1440 

CCAGTATCGG CGATCATGTT GCCAGTTCGA CTTATTTTGG CGGATCAGAA ATTAACATCG 1500 

CTTGTAATTT GCAAGCCCTG GGTATCTCAA CGAAAGTTTT TACCGCACTC CCTGCCAACG 1560 

AGATTGGAGA TCGTTTTCTC ACATTCTTGA AACAGCACCA AATCGATACC AGTTCAATCT 1620 

GTCGGCTTGG CGATCGAATC GGCCTCTACT ATTTGGAGAA CGGCTTTGGT TGTCGTCAAA 1680 

GTGAAGTTTT CTACGATCGT AAGCATACGA GTATCAGCCA GATTCGGCCA AACATGCTAG 1740 

ATATGGATTC TCTCTTTCAG GGGATTAGCC ATTTTCATTT TAGTGGAATC ACCGTAGCTA 1800 

TCGGTCAAGA GGTCCGTGCG ATCCTTCTCC TACTCTTGGA AGAAGCCAAG CGCCGAGGAA 1860

TTGTCGTTTC AATGGATCTC AATCTGAGAA CAAAGATGAT TTCAGTCCTA GAAGCCAAGT 1920 

ATGAATTTTC TAAGTTTGCA CGTTTTACTG ACTATTGCTT CGGTATTGAT CCTCTCATGA 1980 

TTGATGACCA AAATCTAGAG ATGTTTCCAA GAGACAGTGC TAGCCTAGAA GAGGTGGAAA 2040 

ATCGCATGCG ACTTTTAAAA GAAGCCTATG GTTTCAAGGC CATTTTCGAT ACCCTCCGCT 2100 

CTAGTGATGA GCAAGACAAA AATGTCTATC AAGCCTATGC TCTAGAAGAA CTATTTGAAG 2160 

AGTCTGTCCA ACTAAAAACT GCAGTCTATC AACGAATTGG TAGCGGGGAT GCCTTTATAT 2220 

CTGGTGCCCT TTACCAACTA CTCCATCATT CCTCCCTAAA AACTACCATT GACTTTGCAG 2280 

TTGCGAGCGC AACTCTCAAA TGCACTCTTC CAGGAGACCA TCTCTCCACT TCCTCAACTA 2340 

GTATTGAAAA TTTACTGGCA AATGCACAAG ATATCATTCG TTAGGAGAAT TACATGACCA 2400 

AATCAGATAC GATTATTGAA CTAAAAAAAC AAAAAATTGT CGCTGTTATT CGAGGAAATA 2460 

CAAAGGAAGA AGGACTACAA GCCTCGATTG CTTGTATCAA GGGCGGTATC AAAGCTATTG 2520 

AAATCGCCTA TACCAATCAG TATGCAGGAC AAATCATCAA GGAACTTGTA GACTTGTATC 2580 

AGGACGATCA GAGTGTTTGT ATCGGTGCAG GTACTGTGCT TGATGCCGTA ACTGCTAGAG 2640 

ATGCCATTCT AGCTGGAGCA AATTACGTTG TTTCTCCATC TTTCCATGCT GAAACTGCGA 2700 

AAATGTGCAA TCTCTACAGC ACACCGTACA TTCCAGGCTG TATTACCCTC ACAGAGATCA 2760 

CGACTGCACT TGAAGCCGGT AGTGAAATCA TCAAACTCTT CCCAGGTAGT AGTCTCAGTC 2820 

CAGCATATAT CTCTGCAGTC AAGGCACCGA TCCGACAAGT TTCCGTAATG GTAACCGGAG 2880

GAGTCGGCCT AAACAACATC CCTCAATGGT TCGCTGCTGG TGCAGATGCC GTTGGAATTG 2940

GTGGCGAACT CAATAAACTC GCTTCCCAAG GCAACTTTGA CCGCATCAGC GAGATTGCCC 3000

AACAGTATAT TACACTCAGA TAAAATCATA ACTACCCGTC TAACGGGTGG TTTATCTCAG 3060

AGCTATAAGC CCAAATCATC AGCCAGCGCC TAAAGACGCT GGCTTTCACG TTGTTCAAGC 3120

CTTATTGCTC TTGACTCGTC ACTTGCCTCT TTAAGAGACT TTGGTATTAC TTACCACTAT 3180

CCCTAAAGGG ATCCTCATAT TCTTTTACAC TCAATTTATC TAGTGCTATA GTAGATTGAA 3240

ACTGGAATAG TACACCTCTG CTTCTAAAAC ATTGTTAAAA ATCGATTTGA CTGTCCTGAT 3300

CGATTTTGTC CTGTTCTTAT TTCATTTTAC TATATATCAT ACTTTACTCG TTCTCAAATT 3360

TTCATACTCA TGAAGAAATC ATCCACTCGA TAATTTCTTT AATCTTGACT ATATTTCTTA 3420

ATTGTGGCTT CATTAAGCCC TACTGGACTT ACATAATAAC CTTCCTCCCA GAAATGCCGA 3480

TTCCCAAACT TGTACTTGAG ATTGGCGTGT TTGTCAAACA TCATGAGTGC ACTTTTGCCT 3540

TTTAAATACC CCATAAAACT TGAAACACTT AGCCTCGACG GAATACTGAC TAACATGTGT 3600

ACATGGTCTG GCATTAAGTG ACCCTCGATC ATTTCAACAC CTTTATAACT ACACAAGCGA 3660

TGAAATATTT CGTCTAAACT ACTTCTATAT TGATTATAGA TGACTTTTCG TCTATACTTA 3720

GGGGTGAACA CAATATGATA GAACACCTCC ACTTTGTGTA TGATAAACTA TGAGTCTTTT 3780

GTGCCATATT TTTTCTCCTT TCGCTTTACA ATTGGATTGA ACACCTTTAT TGTATCGCGT 3840

TTGGAGTTTT TTTGGTATAA CCTTCGACGC GCAGCCGTAT AGCGGGTGGT TGTTTTGTCT 3900

CGCACCTCAC GGAGCGAGAC GGACTAATAT AGTGGAGTGA AATAGGATAC GAACAAATTG 3960

ATTAGGAAAA TCAAATGAAT TTATAGAAAT CTTTTAGCAG TTATAACGTT CTATTCTAGT 4020

TTCAAAACGC TATAGTCACA TAATAATGAA GTAAAAAAGG ATAAGTATCA ACTTATCCTT 4080

TTTTAAAAGA AAAATCCGAA GATATTTGGC CTTCTTCGGA TTTTTTCTAT TTTCCACAGT 4140

TTCATGTAAT TCATCTAGAT GATGAACAAA TTAGTTGTTC TTTCCTCTAC GGAATAGATA 4200

AAATGCCCCA AGTAGCAAGA ACCCTAGACT TGCCAAGATT GACTGACCTT CTCCTGTCTG 4260

AGGGAGATTC TTTTGATCCG AATGGTTCTT TTCCTCTTCA GATTTTTCCT TTTCTTTTGA 4320

ATTCTGTACT TGTGGCTGAG CTGCTTGCTC TAGCTTTTTA AAGACTTCCT GATCTGGAGC 4380

TGATTCCTGG GTTTCAGGAT TATAGTAGGC AATCTTATAT TCATCCCCTT CTTTTCGAAT 4440

GGTATAGACT CCACGTTTCA AAACTTGGAA TTGGTTGGAA ATAGTAGAGA CAGAATCATC 4500

ATATTTCACA ATGCCCCAAA CTCCTTGTTT AGCATCATAA ACAGACTGAA GGGTTTCGTT 4560

ATTTTCGATG AGGCTACTTT CTAACTCTTT TATCATTTGA TTGAAGGTGG CACGATCCAC 4620

GTTAGGAATG AGCATATAGC CATAAGAATC TCTATTTTGC TTATGAGCCT GACTAATCGT 4680




AAGAAATTCA TTTTCAACTT CCTTGTCTGA CTGTCCTTCA TTGATATCCT TCCAGGCTCC 4740 

CTTTTGCAAA GCCTTACTCA TACTGATTGA ACTCTTCTTA AAGAAAAAGT AACCAATATT 4800 

CTTTTTCGAA TCGAACGATT CTAAAAAGAC ACTTTGGGTT TCAGGATAAT CCTTTTCTTG 4860 

TTCTGTAAGG GAGGCTTCTT TATCATTGAC ATAGACTTTA TATGGATTAC CTGATTCCAG 4920 

TTTTCTCTGG TCAATTGTAG TTGCAGCAGT ATCTGTTGAA GTGTTTTGGA TATTGCTTCC 4980 

TAAAAAGGCG ATCTTATCCT TTAGCATAAA CCAGCTCTTA TGAGCAGTCA ATGTTTGATT 5040 

CCAGTTGGTG AAATCCATGG TTGCTGTCGC ATTGGCATCA TCTAGTTTGC TCGTTCCAAC 5100 

GAAAGCAGAC GGTAAAACTT TACCTGTATC GCTATCCGCT CTCTTAGCAT CCGTCTCTGT 5160 

TGTACCAGGC ATCTTATATG GATTAACTGT TGGCCAGTAG CCATCGCTAT AGTGACTCAA 5220 

ATCGCCATTG TAAAGATAGA ACATCCCATC ACTCGTATAC CAACCACGTT TATTTTCCTT 5280 

GTTCATGTGT TCGTAATTCA AGGTACGACT GGAAAAGAGT GACAAGCCAA ATCCAAACCC 5340 

TTTCTCTGCA TTGTACATGG CTGTTTTATC CATCTTGTTA AAGGCAGATA GGTAACTTGG 5400 

TCTTGGAACA CTTGCGACTC CTGCATCACT TAACAAGGAT TGCATCAAAC TGATATCCTT 5460 

ATAAGTCTTC AAATTCTTAA AGACATCATA ATAACTATCC GATTGAACAA TGGTCTTCAC 5520 

AAGACTCTGC AAACATTGTT TGGTTTCTCC TTCAGACATA TCCGCTATTC GGTGAATCCC 5580 

TCTTAGTACT TCTACTGCGG CCACGTGCCC CTCGCTATTT GCACGACTGA TCGAGCGTCC 5640 

ACGACTCATA TCCATCAACT CTCCATTCAC CAGCAAAGGA GCAAACGATT TATCAATCCA 5700 

GTGGTACATG GTTTGCATTT TATCTTTATC GATTGGATTC TTGGTCTTTT GAATGACTGG 5760 

CAACAGTTGA GACAGGCCAT CAATCAAAAC ATTCCCATAA GCACCCGTAT AGGCAACATT 5820 

GGTGTGGTCG ATATAGGATC CATCTTGATA AAAACCTTCA CCTTGGTCTA CCAACTTGAA 5880 

CACTTGCTCA ATCGAGCGAA TGGTAGAAGA AATTTCTTGA TCATCCTTAC GCAGTAAACC 5940 

AGCTATTACT TTTACCCTTC CCATATCAAC TAAGTTTCCA CCTAGAGCCT TGAATGGGTT 6000 

ATCAGTCGTC TTTCGGAAAT GTTCGGGATC TGGTACAAAT TTTTCAATCA CATCTGTATA 6060 

TTTTTTAATT TCCTCATCAG AGAAGTATTC TTTCATCAGA GACAAGGTAT TGTTGATGGC 6120 

ACGAGGTGTA CCGATTTCAT AATCCCACCA GTTCCCAACA ATGCTCTTTT CACTATTGTA 6180 

GACATGTTTA TGCATCCATT CCATGGAATC CCTGACTGTT CGAACGACAG TTTCATCTTG 6240 

ATAATAACGA GAAGAAGGAT TGGTCACTTG CTTGGCCATC TCCTCCAATT TCCGATAAGT 6300 

GGCAGTCAGA TTTGCAGACG TTTTATAATT TGAAAATTTT TCCCACAAAT AGGTGCGGTC 6360 

CGCCTGACTT GAAATACTGG ATAGGCTATC AGCTACCTTT CCTTCCAATT CCTGGTTTAA 6420 
TTTGGCCATC TGTTCATTTT TAGAATCATA GTATTGATTC CCAGCGATGA TGCCATTCCA 6480

GTCATCCAAA CGGTCTGTGT ATGCATCCTT AACAGAGGCC AGAATCTTCA AAGGAATCTT 6540 

TTTCACTTCC TTGCCATCTT TACTGACAAT GACATTGGTT GTCCCTTCCT TAAGAGGTTC 6600 

TAAAATTCCA TTTTTGACTG AAGCAACGTC AGGATTTTCT ACCTTATAAG TATAGTCCGC 6660 

AAGAGAAAAA ACATGTTTTT TTCCAATTGG TAAATCAATC TTTTCCTCAA GCTGTTTATC 6720 

TGTTTGAGAA TCCTCAGAAA GCTGGTCTGC TACCTCTACC AGCTCAATAT CCTTAAAGGA 6780 

AACAGTCCCA GTTCCTGTTT CATAGAATAA CTCCAGCTTG ATTTTATCAA CATCTAAAGT 6840 

CGGGCTATAG TCTGCTTCAA TGGTCTGCCA GTCCTTTGTT GCTGACGTCG TTGCAGAATT 6900 

CCACAATCGC TTGTCCTTAC CACTTTCCTC AATGATACGA ACTTTGGCAA TCCCGATTTT 6960 

ATTATCTGTT TTAATCTTGA AACGCAGTTT ATACTTTTTC TTAGCTTCAA TAGGAACCAT 7020 

ACGGTGAAGC GCTGCCCTTA ATTTCTCATG GCTTGAGATA GTGATAGCGC CATCCTTAGC 7080 

CTCAATGACT CGAGTTGAGG CATCTGCACT ATTCTTCTGG TCTACCCAAG CTGACCACCC 7140 

CCTGAGCTTT GCTTCCTGTC CGG 7163 
(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9244 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

CGTTATAACA TACATGTAAG CGGTACCCAA AATGGTGCCA AGTCAAAATT TTTAAGGAGG 60 

AAAATACATG TCTTCACATC CAATTCAGGT CTTCTCAGAA ATTGGGAAAC TGAAAAAAGT 120 

TATGTTGCAC CGTCCAGGCA AGGAGTTAGA AAACTTGTTG CCGGACTATC TTGAAAGGCT 180 

TCTTTTTGAT GATATTCCTT TCTTGGAAGA TGCTCAAAAA GAACATGATG CATTTGCCCA 240 

AGCTCTTCGC GATGAAGGAA TTGAGGTTCT CTACCTAGAA CAACTCGCTG CTGAATCATT 300 

GACCTCTCCA GAAATCCGCG ATCAATTTAT CGAGGAATAC TTAGACGAAG CCAACATCCG 360 

TGATCGTCAA ACCAAGGTTG CTATTCGTGA ATTGCTTCAC GGCATCAAGG ACAACCAAGA 420 

ATTGGTTGAA AAAACAATGG CTGGGATTCA AAAAGTTGAA TTGCCAGAAA TTCCTGACGA 480 

AGCTAAAGAT CTAACTGACT TAGTTGAATC AGAGTATCCA TTTGCAATTG ACCCGATGCC 540 

AAACCTCTAT TTCACTCGCG ACCCATTTGC AACAATTGGA AACGCCGTAT CGCTTAACCA 600 

CATGTTTGCA GACACTCGTA ACCGTGAAAC ACTCTACGGT AAGTATATCT TCAAATACCA 660 
CCCAATCTAT GGCGGAAAAG TGGATTTGGT CTACAACCGT GAAGAAGATA CGCGTATCGA 720

AGGTGGAGAC GAGTTAGTTC TTTCTAAAGA CGTCCTTGCA GTAGGTATCT CTCAACGTAC 780

AGACGCAGCT TCTATCGAAA AACTTTTGGT CAACATCTTC AAGAAAAATG TTGGCTTCAA 840

GAAAGTTTTG GCCTTTGAAT TTGCTAACAA CCGTAAATTC ATGCACTTGG ATACTGTCTT 900

CACTATGGTA GACTATGACA AGTTCACTAT TCACCCAGAA ATCGAAGGCG ACCTTCACGT 960

TTACTCAGTT ACTTACGAAA ACGAAAAACT TAAAATCGTT GAAGAGAAAG GTGACTTAGC 1020

TGAACTTCTT GCTCAAAACC TTGGTGTAGA AAAAGTTCAT TTGATTCGTT GCGGTGGTGG 1080

CAATATCGTA GCAGCTGCGC GTGAACAATG GAACGACGGT TCTAACACTT TGAGCATGGC 1140

ACCTGGTGTG GTAGTTGTTT ATGACCGCAA TACCGTGACC AATAAGATTT TGGAAGAATA 1200

CGGGCTTCGC TTGATTAAGA TTCGCGGAAG TGAATTGGTT CGGGGCCGTG GTGGACCTCG 1260

TTGTATGTCT ATGCCATTTG AACGTGAAGA AGTGTAATCG CTGTTCGATA TTCGTCAATA 1320

GAAAATGTAA AAAATAGAAA GAGGAAATAA TAAAATGACA AATTCAGTAT TCCAAGGACG 1380

CAGCTTCTTA GCAGAAAAAG ACTTTACCCG TGCAGAGTTA GAATACCTTA TTGGTCTTTC 1440

AGCTCACTTG AAAGATTTGA AAAAACGCAA TATTCAACAC CACTACCTTG CTGGCAAGAA 1500

TATCGCTCTC CTATTTGAAA AAACATCTAC TCGTACTCGT GCAGCCTTTA CAACTGCGGC 1560

TATCGACCTT GGTGCTCACC CAGAATACCT CGGAGCAAAT GATATTCAGT TGGGTAAAAA 1620

AGAATCTACT GAAGATACTG CTAAAGTATT GGGACGTATG TTTGACGGGA TTGAATTCCG 1680

CGGATTCAGC CAACGTATGG TTGAAGAATT GGCAGAATTC TCAGGCGTTC CAGTATGGAA 1740

CGGTCTAACT GACGAATGGC ACCCAACTCA AATGCTCGCT GACTACTTGA CTGTTCAAGA 1800

AAACTTCGGT CGCTTGGAAG GCTTGACATT GGTATACTGT GGTGATGGAC GTAACAACGT 1860

TGCCAACAGC TTGCTCGTAA CAGGTGCTAT CCTTGGTGTC AATGTTCACA TCTTCTCACC 1920

AAAAGAACTC TTCCCAGAAA AAGAAATCGT TGAATTGGCA GAAGGATTTG CTAAAGAAAG 1980

TGGCGCACAT GTTCTCATCA CTGAAGATGC TGATGAAGCA GTTAAAGATG CAGACGTTCT 2040

TTACACAGAC GTTTGGGTAT CAATGGGTGA AGAAGACAAA TTCGCAGAAC GTGTAGCTCT 2100

TCTTAAACCT TACCAAGTCA ATATGGACTT AGTTAAAAAA GCAGGCAATG AAAACTTGAT 2160

CTTCCTACAC TGCTTGCCAG CATTCCACGA TACTCACACT GTTTATGGTA AAGACGTTGC 2220

TGAAAAATTT GGTGTAGAAG AAATGGAAGT AACAGACGAA GTCTTCCGCA GCAAGTACGC 2280

TCGCCACTTC GATCAAGCAG AAAACCGTAT GCACACTATC AAAGCTGTTA TGGCTGCTAC 2340

ACTTGGTAAC CTTTATATTC CTAAAGTATA ATTTTAGATA ATAAACCGTC TACCAACAGG 2400
TATGAGGGCT GCGACTAATA GCTTTAGTCC GGTCCTCTTT TATGTAATGG TAATCTATTA 2460

TTTCTTATAA AATATGTGAA AAATCATTAA ATTGAAATCT AAACGCATTC TATTGAGTGT 2520 

GATAAAGGAG AATTTATGGC AAATCGTAAA ATTGTAGTAG CTTTGGGAGG AAATGCGATT 2580 

CTTTCTTCTG ACCCATCAGC AAAGGCTCAA CAAGAAGCTT TAGTTGAAAC AGCTAAGCAT 2640 

CTTGTAAAAT TGATTAAAAA TGGAGATGAT CTGATTATCA CTCACGGTAA TGGACCTCAA 2700 

GTTGGGAATC TCTTGCTCCA ACATTTGGCA TCAGACTCTG AAAAGAACCC TGCCTTCCCA 2760 

CTCGACTCAC TTGTCGCTAT GACAGAAGGT AGCATCGGTT TCTGGTTGAA AAATGCTTTG 2820 

CAAAATGCTC TCTTGGATGA AGGCATCGAA AAAAATGTTG CCTCTGTTGT AACGCAAGTT 2880

GTCGTAGATA AAAATGATCC AGCTTTTGTT AACTTGAGTA AACGAATCGG TCCTTTCTAT 2940 

TCAGAAGAAG AAGCAAAAGC AGAAGCCGAA AAAAGCGGAG CGACTTTCAA GGAAGATGCT 3000 

GGCCGTGGCT GGCGTAAGGT CGTTGCCTCA CCAAAACCTG TTGACATCAA AGAAATTGAA 3060 

ACCATCCGTA CTCTTTTAAA TAATGGTCAA GTCGTCGTAG CTGCAGGTGG TGGCGGTATT 3120 

CCCGTCGTCA AAGAAAACAA TGGACATTTG ACTGGTGTCG AAGCGGTTAT TGATAAAGAC 3180 

TTCGCTTCCC AACGTTTGGC AGAATTGGTT GATGCAGACC TCTTCATCGT TTTGACAGGT 3240 

GTAGATTATG TATTTGTTAA CTACAACAAG CCAAACCAGG AAAAATTGGA ACATGTGAAT 3300 

GTTGCCCAGC TGGAAGAATA TATCAAACAA GATCAGTTTG CACCAGGTAG CATGCTTCCA 3360 

AAAGTAGAAG CAGCTATCGC TTTTGTCAAT GGTCGTCCAG AAGGAAAAGC AGTTATTACT 3420 

TCCCTTGAAA ATCTAGGCGC CTTGATTGAA TCTGAAAGCG GAACAATTAT TGAAAAAGGA 3480 

TAAGTTGTTT TACTAATAAG ATGTATTCTA TTTCTAGTAT CTTTATATCA AATTAGAAAT 3540 

TATTCTTGAA AACATGTACA ATATTTCAAA AGATACTAGT TTTAGACTTT AATATGGTAA 3600

AACAAATATA AATAGAAAGC GTTTTCTTGA ATGTTTATTT AAGAAAGTAG TTGGTTTTTT 3660 

ACACTTTGTT AGACATCAGG AGGAAAAACA AATGAGTGAA AAAGCTAAAA AAGGGTTTAA 3720 

GATGCCTTCA TCTTACACCG TATTATTGAT AATCATTGCT ATTATGGCAG TGCTAACTTG 3780 

GTTTATCCCT GCGGGGGCCT TTATAGAAGG TATTTACGAG ACTCAGCCTC AAAATCCACA 3840 

AGGGATTTGG GATGTCCTCA TGGCACCGAT TCGGGCTATG CTAGGTACTC ATCCAGAGGA 3900

AGGTTCGCTC ATTAAAGAAA CGAGCGCAGC GATTGATGTA GCCTTCTTCA TCCTTATGGT 3960 

TGGTGGTTTC CTTGGCATTG TCAACAAAAC TGGTGCTCTT GACGTAGGGA TTGCCTCTAT 4020 

CGTGAAGAAG TATAAGGGCC GCGAAAAAAT GTTAATTTTG GTACTGATGC CTTTGTTTGC 4080 

CCTCGGTGGT ACAACTTATG GTATGGGTGA AGAAACAATG GCCTTCTATC CACTCCTTGT 4140 

GCCAGTTATG ATGGCCGTTG GTTTTGATAG CCTGACTGGT GTTGCAATTA TTTTGCTCGG 4200 
TTCTCAAATC GGCTGTTTGG CATCTACTCT GAATCCATTT GCGACAGGTA TTGCTTCAGC 4260

GACTGCGGGA GTTGGTACAG GGGACGGTAT CGTACTTCGT CTGATCTTCT GGGTTACCTT 4320

GACTGCTCTT AGTACTTGGT TTGTTTACCG TTATGCGGAT AAGATTCAAA AAGATCCGAC 4380

TAAGTCACTG GTTTATAGTA CTCGCAAAGA AGATTTGAAA CACTTTAACG TAGAAGAATC 4440

TTCATCTGTA GAATCTACAC TTAGCAGCAA ACAAAAATCA GTTCTCTTCT TATTTGTGTT 4500

GACATTCATC TTGATGGTAT TGAGCTTCAT TCCATGGACA GACCTTGGCG TTACCATTTT 4560

TGATGACTTT AATACTTGGT TGACTGGTCT TCCAGTTATT GGTAATATTG TCGGTTCATC 4620

TACTTCTGCA CTAGGTACTT GGTACTTCCC AGAAGGCGCA ATGCTCTTTG CCTTTATGGG 4680

TATCCTGATT GGTGTTATTT ATGGTCTTAA AGAAGATAAG ATTATCTCTT CCTTCATGAA 4740

TGGTGCTGCT GACTTGCTCA GTGTTGCCTT GATCGTAGCG ATTGCTCGTG GTATTCAAGT 4800

TATCATGAAC GACGGTATGA TTACCGATAC AATCCTCAAC TGGGGTAAAG AAGGCTTGAG 4860

CGGTCTATCT TCACAAGTCT TTATCGTTGT AACTTATATC TTCTATCTAC CTATGTCATT 4920

CTTGATCCCA TCTTCATCTG GTCTTGCCAG CGCAACTATG GGTATCATGG CTCCACTTGG 4980

AGAATTTGTA AATGTCCGTC CTAGCTTGAT TATCACTGCT TACCAATCTG CTTCAGGTGT 5040

CTTGAACTTG ATTGCACCAA CATCTGGTAT TGTGATGGGA GCTCTTGCAC TTGGACGTAT 5100

CAACATTGGT ACTTGGTGGA AATTCATGGG CAAACTCGTA GTCGCTATTA TTGTAGTGAC 5160

CATCGCCCTT CTTCTCCTTG GAACCTTCCT TCCATTCCTA TAAAATAGTG AGTGAGGTGA 5220

TTCCATGAAA ATAGATATAA CAAATCAAGT TAAAGATGAA TTTCTTATAT CATTAAAAAC 5280

CTTGATTTCC TATCCTTCAG TACTCAATGA AGGAGAAAAT GGAACACCTT TTGGACAAGC 5340

AATCCAAGAT GTCCTAGAAA AAACTTTAGA GATTTGTCGA GACATAGGTT TCACTACCTA 5400

TCTTGACCCT AAAGGTTATT ACGGATATGC AGAAATCGGT CAGGGAGCAG AGCTTCTGGC 5460

CATTCTCTGT CATTTGGATG TTGTTCCATC AGGTGATGAA GCAGATTGGC AGACACCGCC 5520

ATTTGAAGCA ACTATCAAAG ACGGCTGGGT ATTCGGACGT GGTGTCCAAG ATGATAAAGG 5580

CCCTTCGCTC GCAGCTCTCT ATGCAGTAAA AAGCTTGCTG GACCAAGGTA TTCAGTTCAA 5640

AAAGCGCGTA CGCTTTATCT TTGGTACCGA TGAGGAAACC CTCTGGCGCT GCATGGCACG 5700

CTACAATACC ATCGAAGAAC AGGCCAGTAT GGGCTTTGCA CCTGACTCAT CTTTTCCTCT 5760

GACCTATGCT GAAAAAGGGC TTCTACAGGT CAAACTTCAT GGCCCTGGAT CGGATCAACT 5820

AGAGCTTGAA GTAGGAGGCG CCTTTAACGT TGTACCAGAC AAGGCCAACT ACCAAGGTCT 5880

CCTCTATGAA CAGGTTTGTA ACGGTCTCAA AGAAGCTGGT TATGATTACG AAACCACTGA 5940
ACAAACCGTA ACGGTTCTCG gagtrppaaa npaTnr v raar >


ViA loL r A\j i C 


AAGGTATCAA 


6000 


TGCTGTCATC CGACTAGCPA r*PAT'pr* , P'T¥2P TrT w rr w TV , r , iva 


GAACACCCTG 


CTCTCAGTTT 


6060 


TCTTGCAACA CAAGPAGGTr* 2v AnarvirT'Ar*' R/vRRr»ROJiR 

* ^- * * VJV -* U *^»«» v-nnuunuvj i I*, AAuALAiGLAl. AGGAAGAL. AA 


ATCTTTGGTG 


ATATAGCAGA 


6120 


TGAACCTTCT GGff ArVTAT HCWT^A &(prT rv^tnrriinnv^ 


• ATG ATCAATC 


ATGAACGTTC 


6180


TGAAATCCGT ATTfiiraTTT rTv^nnvw nrm Arrair* » r* 
tunnni^vui Hi iunUil 1\. uun<.l\.l,iul \-l TAGCTGAC 


AAGGAAGAAC 


TAGTAGAGTT 


6240 


GPTTAPAAGA *PG r Pf:r*Af , A A & » pmji 0/> JV R /"»m /v>/^^«ni>^Piti[ 

i/«»/uujft 1\j l bLALAAA ACTACCAACT ccgctacgaa 


GAGTTTGACT 


ATCTAGCGCC 


6300 


h-ihiav,\)H» LjCALjAACjAUA gtaaactcgt tagcacactg 


ATGCAAATCT 


ACCAAGAAAA 


6360 


i i>t<jI_ga i AACAGTCCTG CTATTTCATC CGGTGGTGCC 


ACTTTTGCTC 


GCACCATGCC 


6420 


AAATTGTGTA GCCTTCGGCG CCTTATTCCC AGGAGCGAAG CAGACAGAAC ATCAGGCAAA 6480

TGAATGTGCC GTTCTAGAAG ATTTGTACCG TGCTATGGAT ATTTATGCCG AAGCCGTCTA 6540


i lvjAL i ivjUA At- 1 PAAI CAG GCAACTGTTT CTACCAAAAA 


AAATCGACCG 


ATTAATGAAC 


6600 


TGCACCCCAA AAGTTAGACA GAATAAATCT AACTTTTGGG GTGTTTTATT ATGAAATTGA 6660

GTTATGAAGA TAAAGTTCAG ATCTATGAAC TAAGAAAGCA AGGACAAAGC TTCAAACAGC 6720

TTTCAAAAAG ATTTGGTGTG GATGTTTCTG GTCTAAAGTC ATCTGAATCT TTGAGATGAG 6780

CTTTATAAAT CGCTTTTTTC AGTTTTTGCA CTGGTGTTTC GATAAACTCA AACTTTTTAG 6840

CCGTGGTATT GCCTGATTTT ATAGTATATT GAAACTAGAA TAGTACACCT CTCCTTCTAA 6900

AACATTTTTA GAAATCGATT TGACTGTCCT GATCGATTTG TCCTGTTCTT ATTTCATTTT 6960


A<~iAiAiirG AGtCACl rCG TCTTTAACGG CTTT ATT CAT 


AAGCTCTTGT 


AATTTTTCTT 


7020 


TACTATCAAT TACTTCTGAT TTTCCGTTGT AATTTATTGT AATAGGTTTT AACTTACCTA 7080


« *■ A 1 v. 1 LUftl, AL.kjV- 1 LA 1 1 A Ail 1 VjA I \, II 1 1 1TGAAGGC 


TGCTTATGTT 


TTTCCTAAGA 


7140 


TTTTT'FPAAA AATATAT^PTA Tr*Af , a'T'ar i r*r' r , iwrf¥V"«wwnmr> 
***** *V-/%rtrt 1 In Jl^AvjA 1 AuLu Gl 1 I\a J7GTI A* 


TTCTTCAGCT 


TGGTTTTTGT 


7200 


ATTAATTTGA AACATAAGGA ACAAATCCTT CATAGTAACC TAATGCTCCC ATAAGTTCAA 7260

AAGCTTGTTT TCTAATTCAA ACCATTGCAA CTCAGATTTC AGCTTTTCAG ATAAATCCTG 7320

CTCATCCAAA TAATGACTTG AAATTAGTGC TGAACTCGTT TCTGTATCCT GTACAGGCTG 7380

AGCACCCATA CCAGCAAAAA ATAAACTCGT TCCTAGCAAG ACCGAACAAG CTCCTATTGC 7440

ATATGGCCTC AAAGAAAAAC GCTGCTTTCT CTCAAATTGA AATTCTTTCA TCCCATCTCC 7500

CATCATTCAT TATTACTGTA TATTTTGTAT ATCAGAAATA GTTTGTATTC ACAAATCTTT 7560

CTAGTTATTC CCTTATCATT CCTAATTAAG GGAGATAACA TACAATAATT TTTAGTTAAA 7620

TGTATATCGA TGTTTTTTGT TTTTCTTAAT AAACGCAATA CAAAAAGAGC CTGTTACCAA 7680

GCTCTTTGTA CTCAATGAAA ATCAAAGAGC AAATTAGGAA ACTAGCCACA GGTTGCTCAA 7740
AACACCGTTT TGAGGTTGCA GATAGAACTG ACGAAgTCAG CTCAAAACAC TGTTTTGAGG 7800

TTGCAGATAG AACTGACGAA GTCAGTAACA TCTATACGGC AAGGCGACGC TGACGTGGTT 7860

TGAAGAGATT TTCGAAGAGT ATTAGTCTAT TATTTCTTCT CAGCGCGAAG GGCTGACAAG 7920

ATTTGTGTTC GGATATCATC CACACCATTT GGAGTATTTG GTAAAAAGAT AGTTTGATTT 7980

CCTTTAGAGG CAAAGGTATT CAAGGTATCC AAATACTGGT TGGTCAAGAG GATAGACATG 8040

ATTTGTTCTT CTGTCATGCC AACATTGGCT TCCTTGAGTT CGGTGATAGA CTCTGCCAAT 8100

CCATCCACAA TCGCCTTACG TTGTTGGGCA ATCCCCACAC CATGAAGGCG GTCTTTTTCT 8160

GCTTCTGCTT CAGCTGCAGT GACAATTTTA ATCTTGTCAG CTTCCGCCAA TTCTTGTGCT 8220

GCGACCCGCT TACGTTGCGC CGCATTGATT TCATTCATGG ATTGCTTAAC TTCTGCATCT 8280

GGTTCGACCT TGGTAATCAA GGTTTTCACG ATAATGTAGC CGTAAGTGGT CATTTCTTCT 8340

GCTACTTGGT GTTGAACTTC AAGGGCAATC TCATCTTTTT TCTCAAACAA TTCATCCAAG 8400

GTTAATTTTG GAACAGAAGA GCGAAGAGCA TCTTCGATAT AAGATTTAAT CTGAGATTCT 8460

GGACGTATGA GTTTATAGTA AGCATCTGTC ACGCTCTGCT CGTTGACACG GTACTGAGTC 8520

GCTACATTCA TCATAACGAA CACATTGTCC TTGGTCTTAG TCTCAACCAC AATATCACTT 8580

TGCAACAAGC GCAACTGAAT CCGTGCTGCA ATCGAGTCAA TCCCAAAAGG CAAGCGAATA 8640

TGAATACCGC TATTAGCAAC CTTTTGGTAT TTCCCAAAGC GTTCAATAAT CGCCACCGAC 8700

TGCTGACGAA CCACATAAAC TGTACTCAGT GTGACTATCA CCAATAGGAG CACACAAACA 8760

ATCAGAAAAA TCATGAAAAA TATTGCCATA ATGGAACCTC CACAAGTATT TTTCTAGTAT 8820

TATAGCACAT TTAAAGAAGG CTGTGCCGTT TTTACTGCGA TTTTTCCTGA AATGTCAATA 8880

ATTAGAGGTG AATTGTCCTA TTGTCGTCCA ATCTCTTGCT AAAATAACTC TTTATAAAAG 8940

GCAATCGTTT CTTCTAAGGT TGGCATAAAT GGATTTCCTG GTGCGCAGGC ATCAATCAAG 9000

GCATTCTTAG AAAGGTATTC AAAGTCGAAA TCTTTTTCTT CAATACCAAG TTCAGTCAGT 9060

TTCTTAGGAA TACCTACTGT CTCAGAAAGC TTCTCAATCT CAGCAATCGC ATAATCGGCA 9120

CATTCTTGAT CTGATTTACC TTCTACATGA AGTCCCAAGG CTTTGGCAAC ATTGCGGAAA 9180

GCTTCTGGTA CACGTTTAGC ATTTTCACGT TCTATAACTG GTAGCAACAT GGCACAGCAC 9240

ACGG 9244



(2) INFORMATION FOR SEQ ID NO: 69: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8898 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:



GATCTGAACT TTATCATCAT AACTTAATTT CATAATAAAA ACACCCCAAA AGTTAGATTT 60

TTTCTGTCTA ACTTTTGGGG TGTAGTTCAG TCATTGGACT GACGTTTTTT TGTATGCTTA 120

TTTTGATTTG ATGTAGTTGA TACCATCTGC TTTTGGTGCG ACTGCTTTTC CAAAGAAGGC 180

TGCTAAGACA AGAATTGTCA AAACATAAGG TGCAATTTGA AGATAAACCG CTGGCACTCC 240

TTGTAGGAAC GGCAATTGAG AACCGATAAC AGCCAAACTT TGTGAAAGTC CAAAGAAGAG 300

ACTAGAAAGC ATAGCACCGA TTGGATTCCA TTTCCCAAAG ATCATCGCAG CAAGGGCGAT 360

AAATCCAGGT CCAACAATAG TTGTCACTGA GAAGTTAACT GAGATTGATT GCGCATAAAT 420

CGCTCCGCCA ATTCCACCTA GAAAACCTGA AATAATAACC CCTAAATATC TCATCTTGTA 480

GACGTTGATT CCCAAGGTAT CCGCTGCTTG AGGATGTTCA CCGACAGAGC GGAGACGAAG 540

ACCAAATTGA GTCTTAAAGA GAATAAACCA AGCAAGGAAT GAGAAGGCAA TCGCCAGATA 600

ACCAAGTAGA CTAGTTGACT TGAAGAAGAT ATCACCAATC ACTGGGATAT TTGCCAAGAC 660

TGGGAAATCA AAGCGTCCAA AAGTTTGACT TAGGTTGTCG GTTTGTCCTT TGTTATAAAG 720

AACTTTAACT AAGAAAACAG CCAAGGCAGG CGCCATCAAG TTCAATACCG TACCGCTGAC 780

AACATGGTCT GCACGGAAAT GAACCGTCGC TGCTGCGTGG ATGATAGAGA AAACACTACC 840

AACCAATCCT GCTACAAGCA AGGATAGCCA TGGAGTTGCT GCTCCAAATT GTTCTGCAAA 900

TTCAAGGTTA AAGACAACTC CAGAAAAGGC ACCCATAACC ATAATTCCTT CAAGGCCAAC 960

GTTTACCACA CCACCACGTT CAGAGAAAAC ACCACCGATA CTTGTAAAGA TGAGAGGTGC 1020

TGAGTAAATC AGCATAGAAG ACACCAAGAG GGGGAGCAAG GTTATAATAG ACATCTTTAC 1080

TTACCTCCTT TAACTTGTTT TTTCGGTTTG ACAAAGCGTT CGATAAGGTA ATGAACACTG 1140

ACAAAGAAGA TAATAGACGC TGTTACAATG CTGACAAGCT CAGATGGTAC CTGCGCCGCA 1200

TTCATACCAG GAGCCCCAAC TTGGAGAACG CCAAATAGGA AGGCTGCAAA GAGTATACCA 1260

ATTGGTGAGT TGGCCGCAAG CAAACTAACC GCCATTCCGT TAAATCCGAT AGCTAATGAC 1320

GAACCTTGAA CATAGACGTT CTGGAAGGTT CCCAAACCTT CAACAGCTCC ACCAAGACCT 1380

GCCAAGGCAC CTGAAATAAT CATAGATAGG ATAATAGTCC GCTTGGCAGA AATACCAGCA 1440

TATTCTGAAG CATGTGGATT AAGACCAACT GCACGGATTT CAAAACCAAG AGTTGTTTTC 1500

TTGAGCATGA ACCAAATAAC TGCAACGGCA ATGATGGCAA AGAAAATACC AATATTCATC 1560

CGTGAGTTAC CAGTCAACTC AGCCAACCAA GGTGTCTGAT AGGTTGCATT AGCCCCAACA 1620
CGAATGGTCG AATCTGTACT TTGCATGAAG TCTTTAGGGA AAGCATGGAT AAAGGCATTC 1680

CCTACATACA AGACAATGTA GTTCATCATG ATGGTTACAA TAACCTCTGA CGTCCCTAGA 1740

TAGGCCCTAA GAATACCTGG AATCGCTCCG ACAATCCCAC CAGCAATCAA GGCAATCACG 1800

ATGGTTGCTA GAATCATCAA GGGACGGGGC ATATCTGGAT GCGACAGGGC AAACCAACCA 1860

CTGAGAATCC AACCTGCCAA AGCCTGACCA GGAAGTCCGA CGTTAAAGAA ACCAGCTGGA 1920

CTGGCAACGG CAAAACCAAG ACCAATCAAG ACCAGAGGAC CCATAGCACG GAAGATTTCT 1980

CCAATCCCAC GCAGACTGCC AAAGGCTGTA TAGAACAATT CTTCGTAGCC CCAAATAGCA 2040

TCATAACCGA AGATCCACAT GACAATGGCT CCGAGTAAAA TTCCTAGGAA TACAGAAATC 2100

AAGGGAACCG AAATTTGTTG TAATTTTTTA GACATCACTC TTCTCCTTTC CCAAGTTTCC 2160

ACCAGCCATC AAGACACCAA GTTCTTGTTT ATTGGTTGTT TCTGGTGATA CAATACCTTG 2220

AATCTTACCA TCGTGGATAA CGGCAATACG GTCTGAGACG TTTAAAATCT CATCCAATTC 2280

AAAGCTGACA ACAAGGACAG CCTTGCCATT ATCACGCTCT TCAATCAAGC GTTTGTGGAT 2340

ATACTCAATG GCACCGACAT CCAACCCACG AGTTGGCTGG CTAACGATAA GGAGATCAGG 2400

ATCTCGATCA ATTTCACGAG CAATAATTGC TTTTTGTTGA TTTCCTCCTG AGAGTGCAGC 2460

TGCAGGAACT AATTCACTGG CAGCGCGAAC ATCAAACTCT TCCATCAGCT TTTTAGCATA 2520

AGAAGTAATA TTTGAATAAT TCAAAATTCC ATTTTTACTA TGTGGTTCTT TATAGTAGGT 2580

TTGAAGGGCA ATATTTTCAG ATATCATCAT TTCCAAAATG AAGCCATCAC GGTGACGGTC 2640

TTCTGGAACG TGCCCAACAC TTAGTTCTGT AATGTGACGT GGGTGCAAGC CTACAATTGA 2700

ATGTCCTTTT AGCTCAATGC TACCAGATTC AACCTTACGA AGACCTGTAA TGGCTTGAAT 2760

CAGTTCAGAC TGACCATTTC CATCAATCGC CGCAATACCA ACAATGTCTC CAGCACGAAC 2820

ATCCAAGGAC AGATTTTTAA CAGCTGGAAC ACCACGGTTT TCATTGACCA CCAAATCTTT 2880

GATAGACAAA ACCACTTCTT TTGGTTTAGA GGCTTGCTTC TCTGTTTTAA AGGAAACAGA 2940

ACGTCCTACC ATCATTTCCG CCAAATCAGC ATTGGTAGCC CCTGCAATTT CAACGGTTTC 3000

AATTGATTTC CCACGACGGA TAACTGTAAC ACGGTCAGAA ACTGCTCGAA TTTCATCCAA 3060

TTTGTGGGTA ATCAAGATAA TTGATTTTCC TTCTTTGACA AGATTTTTCA TAATAGCCAT 3120

CAACTCATCA ATTTCTGATG GAGTCAAAAC AGCCGTTGGT TCGTCAAAGA TAAGGATATC 3180

AGCCCCCCGA TAAAGTGTTT TTAAAATTTC TACACGTTGT TGGGCTCCAA CTGAGATATC 3240

TGCTACCTTG GCAGAAGGGT CAACAGCTAA GCCATAACGT TCAGAAAGAG CCTTGATTTC 3300

TTTGCTAGCT CCAGCGATAT CTAGCACACC ATTTTTAGTC AATTCACTAC CTAAAATGAT 3360
GTTTTCAGCC ACTGTGAAGG CTTCAACCAA CATAAAGTGC TGGTGAACCA TCCCGATTCC 3420

CAAGCTAGCT GCTTTAGATG GGGAGTCGAG ATTGACAACT TGACCGTTGA CCGCGATTTC 3480 

ACCACTAGTT GGTTCAAGAA GGCCTGCTAA CATGTTCATT AGCGTGGACT TACCAGCCCC 3540 

ATTTTCTCCT AAAAGTGCAT GAATTTCACC TTTTCGTAGG TGCAAGTTGA TTTTGTCGTT 3600 

GGCAACAAAT CCACCAAACA CCTTGGTAAT ATCACGCATC TCAATGACAT TTTCGTGTGC 3660 

CATGTGCTCT TCCTTTCAGA GTCTTATTTT ATTTCAATAA AACTTGCTAG TTTGTCTAGT 3720 

AGCAAGCTTT ACTTAGACAA AATGACTTTG TCTGAACTCT TAAAAAAGCG GCCCTTGGCC 3780 

GCTTCCTAAG AAATGACTTC CATCCATTAT TTTTCAGGAA CTTTTACGCT TCCATCAAGG 3840 

ATTTTAGCTT TTGCATCTTC GACAGCTTTT TTACCTTCTT CTGAAAGGTT TGTTACTGCC 3900 

AAGTCAACCC CTTTATCCTT CAATGAGTAA ACGATCACTT GACCGCCAGG GAATTCTCCT 3960 

CTTTCTGCCT TGTTAGAAAT ATCTTTTACA GTTGTACCAA CTTGTTTCAA AGTAGATACA 4020 

AGAACAAAGT TTGATTCTTT GCCATCTTTA GAAGTGTATT TACCTTCTGC TTCTTGGTCA 4080 

CGATCAACAC CGATAACCCA AACTTTTTCA TTTTCAGGAC GGCTTTCGTT GAGAGATTTT 4140

GCCTCTGCAA AGACACCTGC ACCTGTACCA CCAGCTACTT GGTAAACAAT ATCTGCACCG 4200 

GCTGCGTATT GTGCGGCTGC AATTGTTTTA CCTTTAGCCG CATCACCAAA TGAACCAGCG 4260 

TAGTCAACTT GGACTTTGAT AGATGGGTCT ACTGACGCAA CACCAGCCTT GAATCCTGCT 4320 

TCAAAACGAG AGATAACTTC AGATTCGATA CCACCTACAA AACCAACTTG TTTTGTCTTA 4380 

GTTGTTTTTG CTGCAGCCAC ACCTGCAAGg TAACCTGACT CATTATCAGC GAAAGTTACG 4440 

CTCGCAACAT TCTTTTGGTC TTTAATCACA TCATCAATCA AGACATAGTT CAAGTCAGTG 4500 

TGTTCTTTTG CTGCATCTTT AACTGCATTA TTAAGGGCAA AACCAACACC GAAGATTAGG 4560 

TTGTAACTTC CAGCCGCTTG TTGCAAGTTG TTAGCGTAGT CAGCTTCACT TGTTGATTGG 4620 

AAGTAAGTGA AACCGTTATC TTTTGAAAGA TTGTGTTCTT TACCCCAAGC CTGCAAACCT 4680 

TCCCAAGCTG ATTGGTTGAA TGATTTGTCA TCAACACCAC CAGTATCAGT GACGATTGCT 4740 

GCTTTTGTCT TCACATCAGA AGATGAAGCT GCGTTACGAG AAGAGCGGTT ACCACATGCA 4800 

GCAAGTCCAA CTGCTGCCAC TGCAACTAGG CCAAGACCTA GCCATTGTTT CTTGTTCATT 4860 

ACTGAACCTC CTAAATAAGA TGTGCAACGA TGTTGCAAGT ATGGATTGGT TGGCCACAAG 4920 

GACCGTGCCA CTCAGAGAGC GACTCAGACT AGTTTAAGTC TGTAAAAGAG TATGGAAGTA 4980 

ATTCCCCGAC CGTCATCTCG ACCGTCGATT TATCTTTTGC GACTAAGGTC ACTTTTAGAT 5040 

CTTGTTCAAA AAATTCAGCC ATCACTTGGC GACAAGCACC ACATGGCGAG ATCGGTTTTT 5100 

CAGTTTGACC ATAGACAATC AATTCTGAAA ATTCTCTTTG GCCTTCAGAT ATAGCCTTAA 5160 







AAATAGCTGT TCTCTCACCG CAATTGGTCA AAGGATAGCT AGCATTTTCA ATATTCACTC 5220

CCGTGTAAAC ACTTCCGTCT TTAGCTACTA AAACTGCTCC GATAGGAAAG TGAGAATAGG 5280

GGACATAGGC ATGTTTGCTG GTTTCAATTG CCAGTTCAAT CAACTCAGTA GTCGCCATCT 5340

GCCAATTCTC CTTTTAAAAT AGCTACCCCA GCTGACGTTC CGATACGGGT CGCACCTGCT 5400

TCGACAAAGG CAAGAGCATC TGCATAAGAA CGAGCTCCAC CGGCGGCCTT GACACCCATA 5460

TCAGATCCAA CTGTTTCACG CATTAATGTA ACATCTGCTA TCGTAGCACC ACCAGTTGAA 5520

AAGCCAGTAG ATGTTTTGAC AAAGTCAGCC CCAGCTTTTT GGGCCAATTG GCAAACAACA 5580

ACTTTTTCTT GGTCTGTCAG AAGGCAAGCT TCAATAATGA CTTTCACTAA CTTATCACCA 5640

CTTGCTTCCA CTACTGCGCG AATATCTGAC TCAACCAAGG CTAAATTACC TGATTTGAGA 5700

GCTCCAACAT TGATCACCAT ATCAATCTCA TCTGCACCAT TTTGGATAGC TTCTTTTGTC 5760

TCAAATGCTT TCACGGCTGA AGTTGTTGCT CCCAAAGGGA AACCTACTAC TGTGCAAACC 5820

TTAACATCTG TGCCTTCAAG TCCTTTTTTA GCATGTTCAA CCCAGGTCGG ATTAACGCAA 5880

ACACTGGCAA AGTCATACTC TCTAGCCTCA GACAACAAAC TATCAATTTG TTTTTTCTTT 5940

GCATCTTGTT TTAAAAGCGT ATGATCTATA TATTTATTTA ATTTCATTTC GGTTTTCCCT 6000

CCATTTAGGA GATGATTTCT ACAATTTCAC GGATTTTTTT CACTTCATCA CTTATTTTAA 6060

CACATTTTTG GAAATCTGTA ACTAGTTGAG GTGGAATTTT TTCATTTGTG TATACTTTTG 6120

CAACAATTTC ACCCTTTTGA ACGGAGTCTC CAATCTTCTT TTCAAAAACA ATTCCTGTTT 6180

CATAGTCCAA GGCATCAGAC TTAACTGCAC GACCAGCACC CAGCCTCATG GCATAAAGAC 6240

CAAAGTCCAT AGCTGGAAGA GCTGAAATGA CACCCGTTTC CTGAGCAGGG ATTTCCACCA 6300

CATGAGCTAC ATTTACAGGA CGATAGAGGT CTTCCAAGTC TCCACCTTGG GCTTGCACGA 6360

TTTCCTCAAA CTTAGCCAGT GCTTGACCAT TCTCAAGATG TTGGTGAACT TCTTCAACAG 6420

TTTTGTTAAC ATTTGCCAAA CCAAGCATAA TTTGAGCCAA TTCACAAATA AAGTGGGTAA 6480

TATCCTGACG TCCTTGACCT TGCAAAATCT CCAATGCTTC AAGGATTTCC AGACGATTTC 6540

CAATCGCTCG TCCCAAAGGC TGGCTCATAT CCGTAATCAC TGCTACTGTC TTCCGTCCAA 6600

CAACCTTACC AAGATCTACC ATAGTTTGAG CCAACTCACG CGCCTGATCA ACCGTCTTCA 6660

TGAAGGCACC CTCACCGACA GTCACGTCTA GCAAAATAGC ATCCGCCCCT GCCGCAATTT 6720

TCTTGCTCAT CACCGAACTC GCAATCAAAG GAATCGTGTC GACAGTTGCG GTCACATCAC 6780

GAAGGGCATA GAGAAGCTTA TCTGCTTTGA CCAGCTGGTC TGATTGCCCA ATGACAGATA 6840

CTCCAATATC CTGAACGTGA CGAATAAAAT CCTCTTGACT ACGTTCTACT TGATAGCCCT 6900
TAATGGACTC CAATTTATCA ATTGTTCCGC CTGTATGGCC AAGACCACGA CCACTCATTT 6960

TTGCTACAGG CACACCGAAG CTAGCAACAA GAGGAGCTAA AATCAAGGTT ACCTTATCGC 7020

CGACACCACC AGTAGAATGC TTGTCAACTT TCACACCATC AATGGCTGAC AGGTCAAACT 7080

CTTGCCCAGT CTTAACCATA TTCATCGTTA AATCAGAGAT TTCTCGAGTC GTCATTCCTT 7140


TAAAATAAAC AGCCATAGCA AAGGCAGACA TCTGATAATC 


AGGAACAGTT 






AGCCTTCTAT CAGCCATTCA ATTTCACTTG AAGTCAGTTC 




11111X11 


7260 


GGATTAAATC AACTGCTCTC ATTCTTTCAC ACTTCTAAGG ATATAGTATT CCTTGTCTTT 7320


TTTAAGGATT TCACAATTGC CAAACACATC TTCCATCTTA 




1 ±\jwttA.lVl_ 


7380 


TTGTTTTTTC TGGATGACGA TGGTCAAATC TCCACCAATT 


TPPAAfJAA AT 


CTTTACTTTT 


7440 


CTCGATGATT TCATGAACGA CTTGCTTGCC CGCACGGATA 


\j\jrVc\3t\ 1 luu 


AflA i\jAL_A Iaj 


7500 


GTCAAATCGC CCTTGAACTC TTGCATAAAT ATTAGATTGA 


AATATrvrppe; 


CTTTTGCATT 


7560 


ATTTTTTTCA GCATTTCTCT GAGCTAAATC CAGGGCACGA 


1 i*f~l'i u 1 * Ti R ft 
Uxf\it\L 


t-AACCATGGT 


7620 


CGCCTGAACT CCGTAAACCT TGACCAAGGA CAAACCTAAT 


VjL) AL. L_A J. f\J\{. 


CAUAGCCTAC 


7680 


ATCTAGGACT GTCTCTCCTT GGTTGACATC CAGACACTTG 


AGCAAGAGTT 




7740 


GTCAACCATT TTCTTGCTAA AAACACCCGC ATCTGTCAAA 


AAAGTCATTT 


1 1 rv- 


7800 


CAAGTCCACT CTCAACTCAT GAATGTCGTG AGCAGCGTCA GGATTTTCTG CATAGTACAT 7860

TTTACTCATG ACACTATTTT ACCATAATTT GACTCAAATT GTAAATCGTT TACAAATTGA 7920

TAATAAAACG AAAAAGACCG AAGAAAGCAA GTGACGAAGC CATTTTCTTC AATCTCTTTC 7980

AACACTTATA AATAATAAAC CATTTAGAAC TATAAATATC ACAGTCCAGA TAAAAACAAA 8040

AAGTTTATCA TCTATAATCA GGCAGATTAT TATTTCTATT GCTTAACCTT AAAATACTTT 8100

ATTATCAACA AAATTCCTAA CAAAATGTTT AGATAAAAGC CCAACTGATA CGTTTATGTC 8160

AGGATTTCCA AACTTGTCCA AAGTCGTATC AAATCTTCTA GTGACATGTG GAAGAAATAA 8220

CCCTCTGTCG CAATCCGTAG GACTAAAAAG CAATAACTAC CCGCAGCAAT CCATTTCGTC 8280

CATCGTTTTT TAGTAAGAAA GCAATTAAGA ACGAACAAAT AAAGACAGCT GTTACAATAG 8340

CATGTTCCAT CAAAAAAGTA AAACCGTAAT AGGTTTCCAC AAAGCATCTA CCATTATCTG 8400

CATTGGTTCC TTTTATAAAA GGTAAAGCAA AACTTAAAAT AAAACAGAGT TCCAATATGT 8460

AACGTTTTAA GATTTTCATA GTACACCTCC TATAAGTTGT GAACTAAAAA GCCCCCTTTA 8520

TAAGCTTATA AATCAGTAGA ATCTATCTCC TATTTCATCA ATAAATTGAT CACTTATACT 8580

ATATACCATT GACTTACCAC ATTCAAGAAA CCGCTTTATT TTTTTAGCTT TTTATGGTAT 8640

GATAGACAAA ATATCTAGGG GAAAACAAAT GACCAACGAA TTTTTACATT TTGAAAAAAT 8700
CAGCCGCCAG ACTTGGCAAT CTTTACATCG AAAGACAACA CCTCCTTTGA CAGAAGAAGA 8760

ATTGGAATCT ATCAAGAGTT TTAATGACCA AATCAGTCTC CAAGACGTTA CAGATATCTA 8820 

TCTCCCCTTG GCTCATTTGA TTCAGATTTA CAAGCGAACT AAGGAAGATT TAGCCTTTTC 8880

AAAAGGAATT TTCCTCCA 8898 



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13188 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

TATCTTAACG aGGATTGGGT TTATCGTCAG TCTTATTGCC CTAATTGTGG GAACAATCCC 60 

TTAAATCATT TTGAAAATAA TCGGCCTGTA GCAGATTTTT ACTGTAATCA TTGTAGTGAG 120 

GAGTTTGAAC TAAAGAGCAA AAAAGGAAAT TTTTCATCAA CAATCAATGA TGGTGCTTAT 180 

GCAACGATGA TGAAGCGTGT GCAGGCAGAT AATAATCCTA ATTTCTTTTT TTTAACTTAC 240 

ACAAAAAATT TTGAGGTAAA TAACTTTCTT GTCCTTCCGA AGCAATTTGT TACACCGAAA 300 

TCGATTATTC AAAGAAAACC ACTTGCACCA ACTGCTAGAC GAGCAGGTTG GATTGGTTGT 360 

AACATTGATT TATCACAAGT ACCTTCTAAA GGAAGGATAT TTCTTGTGCA AGATGGACAA 420 

GTTAGAGATC CAGAAAAAGT TACAAAAGAA TTTAAGCAAG GTTTATTTTT AAGGAAGAGC 480 

TCTCTGTCAT CAAGAGGTTG GACAATAGAA ATTCTAAATT GTATAGATAA GATAGAGGGT 540 

TCAGAATTTA CCCTTGAAGA TATGTATCGT TTTGAAAGTG ACCTAAAAAA TATCTTTGTT 600 

AAGAACAATC ATATCAAAGA AAAGATTAGG CAACAGCTTC AAATATTAAG AGACAAAGAA 660 

ATAATAGAAT TTAAAGGTAG AGGAAAGTAT CGGAAATTAT GAAAACGAAA CAACTTGTTG 720 

CATCAGAAGA GGTGTATGAT TTCTTAAAAG TCATCTGGCC TGATTATGAA ACTGAAAGCC 780 

GTTACGATAA CCTAAGTTTA ATCGTCTGTA CCTTATCAGA TCCCGATTGT GTGAGATGGT 840 

TATCTGAAAA TATGAAATTT GGTGACGAAA AACAACTAGC TTTGATGAAG GAAAAATATG 900 

GGTGGGAAGT AGGAGATAAA TTGCCAGAGT GGCTACATAG CTCCTATCAT AGATTATTGT 960 

TAATAGGTGA ATTATTGGAA AGCAATCTAA AACTGAAAAA GTATACAGTA GAAATTACAG 1020 

AAACTTTATC ACGTTTAGTA AGTATAGAGG CTGAAAATCC AGATGAAGCC GAACGACTTG 1080 

TAAGAGAAAA GTATAAGAGT TGTGAAATTG TTCTTGATGC AGATGATTTT CAGGACTATG 1140 
ACACTAGCAT ATATGAATAG GTAGATGTTT TTATTTTGTC AACAAAAAAG AGGCTCGCAC 1200

CTCTTTTTCT TATTTCTTTT TATGATTTAA TACGGCATTG AGGACAATAG CGAGTAGGCT 1260 

GGCTACGACG ATTCCGTTTG AGAAGAACAT TTGGAAGGCT GTCGGCATGC TGACAAAGAG 1320 

ATTACTGTTG TTGAGACCGA CACCTGCAGC GATTGAAACA GCTGCGATAA GGAAGTTGTG 1380 

TTCATTGTTA GCAAAGTCAA CACGGGCGAG GATTTGCATC CCTTGAATTG ATACAAAAGG 1440 

AAACATTACC AGCATGGCAC CACCGAGGAC GGAGCTTGGA ATGATTTGGG CAAGGGCGCC 1500 

AAACTTAGGA AGCAGTCCAA GGAGAACCAG GAAACCAGCT GCGTAGTAGA TTGGCAGGCG 1560 

TTTTTTGATG CCTGACAATT TAACCAAACC AACGTTTTGT GAAAATCCGG TGTAAGGGAA 1620 

GGTGTTAAAG ATTCCTCCGA GAAGTACGGC CAAACCTTCT GCGCGGTATC CGTTGCGAAG 1680 

GCGCGTGCTG TCGATTGGAT CCTTTGTGAT ATCAGACAAG GCCAGATAAA CACCAGTTGA 1740 

CTCAACCATA GACACCGTTG CGATGATACA CATCATGACA ATAGATGAGA TTTCAAAGGT 1800 

TGGCATCCCA AAGTAGAGTG GAGTTGGGAC ATGGACAAGT GGAGCTACCG CAACAGGAGA 1860

GAAGTCCACC AAGCCCATAG TAGCAGCAAT GGCAGTTCCA ACAACCAGAC CAATCAAAAT 1920 

AGAGATAGAC TTGATAAATC CTTTGGTAAA GATGTTGATC AAGAGGATAA TCAGAACAGT 1980 

AATAGCTGCA AGCAAGAGAC TTTGACCAGT TGGCTCTGGA ACGTTATTTC CCATATTTCC 2040 

AATAGCGACA GGGATCAAGG TTAAACCAAT CGTGGTAATA ACAGATCCTG TTACGATAGA 2100 

TGGGAAGAGA TTGGCTACTT TTGAGAAGAT GCCTGAAACA AGAACCACGT AAATCCCAGA 2160 

TGCGATAAGG GCACCAAACA TAGCGCCACT ACCATGGCTT TGCCCAATGA TAATCAAGGG 2220 

AGCGACCGAC TGGAATGCAA CTCCAAGAAC GACTGGGAGT CCAATCCCAA AGTATTTGTT 2280 

GAGTTGGAGT TGGAGGAAGG TTGCCACCCC ACACATGAAG ATATCTGTAG AAATCAGGTA 2340 

GGTCAACTGC TCAGCTGAAT AGCCAAGGGC TGTCGCAATC ATGATGGGAA CCAGGATAGA 2400 

TCCTGAGTAC ATGGCTAGTA AGTGCTGCAA GCCAAGAACG GCTGCTTGCG AGTGTTTTTC 2460 

TTGAGTTTGC ATTAGAGATC TGCCTCCTTA AATACGACTT GACCATTTTC AAAACAATCC 2520 

AAACGAGCAA GTGATAGGAC AGGGTAGCCT GCTTTTTCAA GCAAATCACG ACCATCTTGG 2580

AAGGATTTCT CAATCACGAT ACCGATAGCT TGGACTGTGG CACCGGCCTG TTCGATGATT 2640 

TGAATCAAGC CTTTAGCAGC TTGGCCATTA GCAAGGAAAT CGTCGATAAT CAAAACGTTG 2700 

TCCTCTGGTG AGAGGAATTT TTCAGCGATA GAAACGGTGC TGGTCACCTG CTTGGTAAAG 2760 

GAGTAGACTT GAGCAGTTAA GATGCCTTCG TTCATGGTGA TGTTCTTAGC TTTTTTGGCG 2820 

AAAATCATGG GAACGTTTAA GGCTTCAGCT GTAAAAACGG CTGGGGCAAT ACCCGACGCT 2880 

TCAATGGTTA CGACCTTGGT AATGCCAGTA GTAGCAAATT TTTCCGCAAA AACCTTACCA 2940 
ATCTCTCGCA TCAAGCTAAA GTCAACTTGG TGGGTTAAAA AGGAATCTAC CTTGAGGATG 3000

TTATCACCCA AGATATGCCC ATCCTTGAGG ATGCGCTCTT CTAATAATTT CATAAGACCT 3060

CCTAAAGTCT AAAAGTTAAT TTACTTGTTG TTTAAATATT TCTATAGTGA TCCCTTTTGC 3120

TAATACTATA TATTTGATAA AACTATTACG AGCGAAGCGA GTCTTATCAA ATATTTCCCG 3180

TTGTAGTGGT ATCATAGACA ATAATCTTGT TATTGTCTAT GACGGGATTT TTGAGAGTAA 3240

AATAGTTCGG GGAACTATTT TAGCCTAAGC CTAGAAATGA AAGAGCTAGG GGCTCAAAAA 3300

TTAGGGATGA AATTCCCTGG ATTCCTGAAA TTATTCACAG GATAATTTCA CCTCCCGTCC 3360

GCACTAATTA AGGGAAATAT TAAAAAAAGA CCTACTTAAT CTCTAAGTAA GTCCCCTAAA 3420

TAGACATGGC AAAAACGGCC ATATCTCACT GCTGACTTAC TTATTGTTAG GTGTTCCGGG 3480

ACCTTGTAGA AACGTCGTGC CAATTCACGA CATAAACAAG TAAAACGATA TTCAATTTTA 3540

AATAGGCTTG AGCCAATGTT TTTATTTTAC ACTAAATAAC TTTAGAAATC AACTATTTTG 3600

TTAGTGTTTT GGTTTAAAAA ACGAACAAAA AGAAGAGAGG GTGAACAAAA ACTCCATTGT 3660

AAGCTAACAG TTATACTAAA TGAAAATCAA AGAGCAAACT AGGAAGCTAT CCACAACCTC 3720

AAAACACTGT TTTGAGGTTG TGGATAGAAT TGACAGAGCC AGTATCATAT ACCTACGGTA 3780

AGGCGACGTT GACGTGGCTT GAAGAGATTT TCGAAGAGTA TTAGAAGATT TTTCCATCAT 3840

AAAAGGCATA CTATCAAGCT TTTAGACACC TGACAATATG CCTTTTTCTA ACTTTAAAGA 3900

CTTTTCCCAA TTTTTATTAT TCTACTCGCT AAATCTTAAA AAATAGCCAT CTGGATCCAA 3960

AACTGCAAAT TTATGAGGAT AGATATAGGG ATCACTGACA CGAAACTTTC TTTTGGTCAA 4020

GGGACGATAA ATAGGATAGT TTGCCTTCAT CACTCTTTAA TAGAGTTTTG AAACATCCTT 4080

TATGCCAAAG GAGAGATTGA CTCCACGACC AAAGGGATAG GTCAGTTCAG CTAGTTGATG 4140

CTTTGTTCCC TCCTCTAACA TTAGTTGACA CTCTTCAAGA GAAAGAGAAA GTTTTCTTCT 4200

GGACGTTGGT ATTCAATCCT AAAACCCAGT AAACCACAGT AGAAGGACCG GGACTGTTCG 4260

ATATTCGATA CAAGCAACTC GGGAATGACC GCATTGTAGT CCATATAGAA AATCCTTACA 4320


AGTCAATTTC 


CAAGACAATC 


GGTGTATGGT 


CTTGGCGAGC 


ACCTGAGTCA ATCATATCAG 


4380 


ATTTAGTGAC 


CTTGTCAGCG 


ATACGGTTAC 


TTGTGAGCCA 


GTAGTCGATT CTCCAGCCTG 


4440 


TATTGTTGAT 


TTTAGAAGTT 


TTGCTGCGTT GTGCCCACCA AGTGTAGCGT TCAGGAACAT 


4500 


CGCCATGAAC 


ATGGCGGAAG 


GTGTCTGTAA ATCCAGTTGC 


CAAAAGGTTG GTAAATCCAG 


4560 


CACGTTCCTC 


GTCAGTAAAT 


CCAGGTGAAC 


GGCGGTTGCT 


AGCAGGATTT GCAAGGTCGA 


4620 


TTTCATTGTG 


GGCTACGTTG 


TAGTCACCGG 


TCGCAAGGAC 


TGGTTTTTCT TTGTCTAGTT 


4680 






CAGCCAAATA GTCAGCATAT TTGGCATCCC AGACTTGGCG TTCTTCCAAG [...] 4740

CGTCACCAGC GTTTGGAGTG TAAACTTGGG TTACGAAAAA TGCATCAAAT [...] 4800

TGATACGACC TTCCAAGTCC ATGGTAGAAG GGGCACCGAT TTCTGGGAAG [...] 4860

GTGTAAGTTC TTTCTTATAA AGGAACATGG TTCCAGCATA GCCTTTACGG [...] 4920

GGGAAGAGCG CCACGTGTTT TCGTAGCCTG GGAAGAGTTC TTCTAAAATT [...] 4980

TCTTTGTAGG TCCTTTGGCA GAAAGCTTGG TTTCTTGGAT AGCAATGATA TCAGCATTTT 5040

CAGCGACCAA GGTTTGTAGG ACTTCTTGGG ACAATTTGGC ACGAGCTGAG TCACTAGTTA 5100

GGGCAGCGTT TAGGGAATCA ATATTCCATG AGATAAGTTT CATAAAGTTA CCTTTTTCAT 5160

TCAGATTATA GATTTTATTA TACCAAAAAA AGATCTATTT CCCCAACGTA TGGTTTGAAA 5220

AATTACTCTC TTTCGTTTAT AATTAAGAAT GATTTTATGA AAGGGAGTGA AAATACATGA 5280

AATTCTACTC TTATGACTAT GTACTCAGCC AAATCGGTCA GCAAAATGGT ATCATGGTTG 5340

GCTTTGGGAT TGTTCTATTA GCTGTGACAG TTTTTTTTGC TTTCAAGGCA TACCATAATA 5400

AAAAGGGAAG CGAATTTCGT GAGTTGGTCA TGATTTCAGA TCTGGCCTTA TTTAGCTCTG 5460

CTTTTGGTCA GCATCACGAC TTATCAAAAC AATCAAGTTT CTAACAATAA ATTTCAAACT 5520

TCACTTCATT TCATCGAGGT TGTTTCCAAA GATTTGTGAG TAGACAAGTC AGAAGTCTAT 5580

GTTAATACTT CCACAAACAC AGATGGCGCA CTTATCAAGG TGGGAGATCG CTATTATCGT 5640

GCCCTAAATG GAAGTGAGCC AGACAAGTAC CTGTTAGAGA AAGTCGAATT GTATAAGACA 5700

GACGCAATTG AACTGGTGGA TGTGAACAAA TGACACTTAA TTATATCGAA ATTTTAATCA 5760

AACTGGTCTT GACTCTCAAA TAGCTCAACA ACAATGTTCA CTTTGTGAAA CGTTTGATTG 5820

ATGGTAAGCC AACTCTCCTT ATCAAAAATG GGAATATTGA CCCAGAAGCC TGTCGTTCAG 5880

TTGGTTTGTC TGCATCGGAT GTATCCCTCA AACTTCGTAG CCAAGGGATT TTCCAGATGA 5940

AGCAAGTCAA ACGAGCTGTG CAAGAGCAAA ATGGGCAACT CATCGTTGTG CAAATGGGAG 6000

ATGAAAATCC TAAGTATCCA GTTGTGACTG ACGGTGTGAT TCAAGTAGAT GTCTTGGAAT 6060

CGATTGGTCG TAGCGAAGAG TGGTTGCTTG ATAACCTCAG TAAACAAGGG CATGACAATG 6120

TAGCCAATAT CTTTATTGCT GAATATGACA AGGGTGCTGT TACAGTCGTA ACTTATGAAT 6180

AAGAAAAACC TGGGGTCTTG TACTCTTCGA AAATCTCTTC AAACCGCGTC AACGTCGCCT 6240

TGCCGTATGT AGGTTACTGA CTTCGTCAGT TCTATCTACA ACCTCAAAGC AGTGCTTTGA 6300

GCAGCCTGCG GCTAGTTTCC TAGTTTGCTC TTTGATTTTC ATTGAGTATT GGCCTCAGGT 6360

TTCCATTTGC AATCAGAAAG GGATTTTATG TCCATTATTC AAAAACTTTG GTGGTTTTTC 6420

AAGTTAGAAA AACGCCGTTA TCTAGTCGGA ATTGTGGCCC TGATCTTGGT TTCCGTCCTC 6480
AATCTCATTC CTCCTATGGT TATGGGGCGG GTCATTGATG CCATCACATC GGGGCAATTA 6540

ACCCAGCAGG ACCTCCTTCT TAGCCTATTT TACTTGCTAC TTGCAGCCTT TGGTATGTAC 6600

TATTTGCGCT ATGTGTGGCG TATGTATATC CTTGGGACCT CTTATTGCTT GGGACAGATC 6660

ATGCGGTCTC GCTTGTTTAA GCATTTCACA AAAATGTCGT CAGCCTTTTA TCAAACCTAT 6720

CGGACGGGTG ATCTGATGGC ACACGCAACC AATGATATCA ATGCCTTGAC TCGTTTAGCA 6780

GGTGGCGGTG TCATGTCTGC GGTGGATGCC TCTATCACGG CTCTGGTGAC TTTGTTGACC 6840

ATGCTCTTTA GCATCTCATG GCAGATGACT CTTGTTGCCA TTCTCCCCCT ACCTTTCATG 6900

GCCTATACGA CTAGTCGCCT AGGGAGAAAG ACTCATAAGG CCTTTGGCGA ATCCCAAGCT 6960

GCTTTTTCTG AACTCAATAA CAAGGTACAG GAGTCCGTAT CAGGTATCAA AGTGACCAAG 7020

TCTTTCGGTT ATCAGGCAGA CGAGTTGAAG TCTTTTCAGG CAGTCAATGA ATTAACCTTC 7080

CAAAAGAACC TGCAAACCAT GAAATATGAT AGTCTCTTTG ACCCTATGGT TCTCTTGTTT 7140

GTTGGTTCGT CCTATGTTTT AACGCTTTTG GTTGGCTCCT TGATGGTTCA GGAAGGGCAG 7200

ATTACAGTTG GGAATCTAGT CACCTTTATC AGCTATTTGG ATATGCTGGT CTGGCCTCTT 7260

CTGGCCATCG GTTTCCTCTT TAATACTACT CAGCGAGGGA AGGTTTCTTA CCAGCGGATT 7320

GAAAATCTTT TGTCTCAGGA ATCTCCTGTA CAAGACCCTG AGTTTCCTCT GGATGGTATT 7380

GAAAATGGGC GTTTGGAGTA TGCCATTGAC AGCTTTGCTT TTGAAAATGA GGAAACACTG 7440

ACGGATATTC ACTTTAGTTT GGCAAAAGGG CAAACACTGG GCTTGGTTGG GCAGACAGGC 7500

TCTGGGAAAA CGTCCTTAAT CAAGCTCCTC TTGCGTGAAT ACGATGTGGA TAAGGGTGCC 7560

ATTTATCTAA ACGGTCACGA TATTCGGGAC TATCGTCTGA CAGACCTTCG CAGTCTCATG 7620

GGCTATGTTC CTCAGGACCA GTTTCTTTTT GCGACTTCAA TCCTAGACAA TATCCGCTTT 7680

GGCAATCCTA ACTTGCCCCT TTCAGCGGTC GAGGAAGCTA CTAAGCTAGC CCGGGTTTAC 7740

CAAGATATTG TAGACATGCC TCAAGGATTT GATACGCTGA TTGGTGAAAA AGGAGTCACT 7800

CTTTCTGGTG GTCAAAAGCA ACGGTTGGCT ATGAGTCGGG CTATGATTTT AGACCCTGAT 7860

ATCTTGATTT TGGATGATTC CTTATCCGCC GTAGATGCCA AGACAGAGTA TGCGATTATC 7920

GACAACCTCA AGGAGATGCG AAAGGACAAG ACAACCATTA TCACTGCCCA TCGCCTCAGT 7980

GCTGTTGTCC ATGCAGATTT TATTTTAGTT CTACAAAATG GTCAAATTAT CGAACGAGGC 8040

ACGCACGAAG ACTTGCTAGC TTTGGATGGC TGGTATGCCC AAACCTACCA GTCTCAGCAG 8100

TTGGAAATGA AAGGAGAAGA AGATGCAGAA TAAACAAGAA CAATGGACTG TATTGAAGCG 8160

CTTGATGTCT TATCTCAAGC CTTATGGACT CCTGACCTTT TTGGCACTCA GTTTTCTCCT 8220
AGCGACGACG GTCATTAAAA GTGTCATACC CCTCGTGGCT I\_CLACTTTA TCGACCAGTA 8280

TCTCAGCAAT CTTAACCAAC TAGCCGTTAC CGTTTTGCTG tj 1 trT ACT ATG GTCTCTACAT 8340

CCTACAAACT GTAGTTCAGT ATGTCGGCAA TCTTCTCTTT GCGCGCGTGT CTTACAGTAT 8400

TGTTAGGGAT ATTCGTCGGG ATGCCTTTGC CAATATGGAG AAACTGGGCA TGTCTTACTT 8460

TGACAAGACG CCAGCAGGTT CTATCGTTTC TCGTTTGACC AACGATACCG AGACGATTAG 8520

TGATATGTTT TCTGGGATTT TATCCAGCTT TATCTCAGCA GTTTTTATCT TTCTGACAAC 8580

CCTTTATACC ATGTTGGTGC TGGATTTTCG TTTGACGGCT TTAGTCTTGC TCTTTCTTCC 8640

TTTGATTTTC CTTTTGGTCA ATCTCTATCG AAAAAAGTCA GTGAAAATCA TCGAGAAAAC 8700

CAGAAGTCTC TTGTCAGATA TCAATAGTAA GCTGGCAGAG AATATCGAGG GAATCAGGAT 8760

TATTCAGGCC TTTAATCAAG AGAAGCGCCT GCAGGCAGAA TTTGATGAAA TCAACCAAGA 8820

ACACTTGGTC TACGCCAACC GTTCTGTAGC CTTGGATGCC CTCTTTTTGA GACCTGCCAT 8880

GAGTTTGCTG AAACTTCTAG GCTATGCAGT CTTGATGGCC TACTTTGGCT ACCGTGGTTT 8940

TTCTATCGGG ATAACGGTCG GGACCATGTA TGCCTTTATC CAGTACATCA ACCGCCTTTT 9000

TGACCCCTTG ATTGAGGTGA CGCAAAACTT TTCAACTCTG CAAACGGCTA TGGTTTCTGC 9060

AGGTCGTGTC TTTGCCCTGA TAGACGAGAG GACCTATGAA CCTCTTCAAG AAAATGGGCA 9120

AGCCAAAGTC CAAGAAGGCA ATATCCGTTT TGAACATGTG TGTTTCTCAT ATGACGGTAA 9180

ACATCCGATT CTGGATGACA TTTCTTTCTC TGTTAATAAG GGTGAAACCA TTGCCTTTGT 9240

AGGTCATACA GGTTCAGGGA AATCGTCTAT TATCAATGTC CTCATGCGCT TTTATGAATT 9300

CCAGTCAGGG AGAGTTCTCT TGGATGATGT GGATATCAGG GATTTCAGTC AAGAAGAGCT 9360

GAGAAAAAAC ATCGGTTTGG TCTTGCAGGA ACCCTTCCTC TATCATGGAA CTATTAAGTC 9420

CAATATCGCC ATGTACCAAG AAACCAGTGA TGAGCAGGTT CAGGCTGCGG CAGCCTTTGT 9480

GGATGCAGAT TCCTTTATTC AAGAACTTCC TCAGGGGTAC GACTCCCCTG TTTCCGAGCG 9540

TGGTTCGAGC TTCTCTACTG GGCAACGCCA GCTTCTTGCC TTTGCTAGAA CAGTCGCCAG 9600

CCAGCCTAAA ATCCTGATTT TGGATGAAGC GACAGCCAAT ATTGACTCTG AAACAGAAAG 9660

CTTGGTTCAA GCTTCTCTGG CGAAGATGAG ACAGGGCCGA ACAACTATTG CTATCGCTCA 9720

CCGCCTTTCT ACTATTCAAG ATGCCAACTG CATCTATGTC TTGGATAAGG GACGCATTAT 9780

CGAGAGTGGA ACCCATGAGG AACTCTTGGC TCTGGGAGGA ACCTATCACA AGATGTATAG 9840

TTTGCAGGCA GGGGCCATGG CCGATACTCT TTGAAAATCT CTTTAAACCA TGTCAGCTTT 9900

ATCTGCAATC TCAAAGCTGT ACTTTGATTT TCATTGAGTA CTAGAAGGAA ATCCTTCAAA 9960

TTACAGATTT CTTTCACCGC CTTTTCCATT TTGTGGTATA ATGAAAAATG TTGACAAATA 10020
GTATAATAAA AACAAAGGAG AACAGCATGC TGAAATGGGA AGACTTGCCT GTGGAAATGA 10080

AATCAAGCGA GGTTGAGTCT TACTACCAGC TTGTCTCTAA AAGGAAGGGT TCGCTGATTT 10140

TCAAGCGTTG CTTGGACTGG GTTTTGGCCT TGGTCTTACT GGTTCTGACC TCTCCCATCT 10200

TTCTCATCTT GAGCATTTGG ATCAAGTTGG ATAGCAAAGG GCCAGTGATT TACAAGCAAG 10260

AGCGTGTGAC CCAGTACAAC CGTCGGTTCA AGATTTGGAA GTTTCGTACC ATGGTGACGG 10320

ATGCGGATAA AAAAGGAAGT CTGGTGACTT CTGCTAACGA TAGCCGCATT ACCAAGGTTG 10380

GAAATTTCAT CCGACGTGTC CGTTTGGACG AACTGCCTCA GTTGGTCAAT GTCCTTAAAG 10440

GTGAGATGTC CTTTGTCGGT ACACGACCTG AAGTGCCACG TTATACAGAG CAGTATAGCC 10500

CTGAAATGAT GGCAACCTTG CTCTTGCAAG CAGGGATTAC CTCTCCAGCC AGCATCAACT 10560

ACAAGGATGA GGACACAATT ATCAGTCAAA TGACGGAGAA AGGTCTGTCA GTTGATCAGG 10620

CCTATGTGGA GCATGTTCTT CCTGAAAAGA TGCGCTATAA CCTCGCCTAT CTCCGAGAGT 10680

TTAGTTTCTT TGGGGACATC AAAATCATGT TTCAAACCGT GTTTGAGGTA CTAAAATAAA 10740

GTAGTCATAA GAAAATGAGT ACAGATAAAA GGAGCAAATC AATGCCAAAT TACAATATTC 10800

CATTTTCACC GCCTGATATC ACAGAAGCAG AAATTACTGA AGTAGTGGAT ACCCTGCGTT 10860

CTGGTTGGAT CACAACAGGT CCTAAAACAA AAGAACTGGA GCGCCGCTTG TCTCTTTACA 10920

CACAGACACC TAAGACTGTT TGTCTCAACT CTGCGACAGC CGCTCTGGAG TTGATTTTAC 10980

GCGTTTTGGA AGTGGGACCT GGTGATGAAG TCATCGTTCC AGCCATGACC TATACGGCTT 11040

CATGTAGTGT CATTACGCAC GTGGGAGCAA CCCCTGTCAT GGTGGATATC CAAGCAGATA 11100

CGTTTGAGAT GGACTATGAC CTGCTTGAGC AAGCTATCAC TGAGAAAACT AAGGTGATTA 11160

TTCCAGTAGA GCTCGCAGGG ATTGTTTGCG ATTATGACCG TTTGTTCCAA GTCGTGGAGA 11220

AAAAACGTGA CTTCTTTACC GCTTCAAGCA AGTGGCAAAA GGCCTTTAAC CGTATTGTCA 11280

TTGTCTCTGA TAGTGCCCAC GCTTTGGGAT CTATTTATAA AGGACAACCT TCTGGTTCTA 11340

TCGCTGACTT TACTTCCTTC TCATTCCATG CAGTTAAGAA CTTTACAACG GCAGAAGGTG 11400

GAAGTGCGAC TTGGAAAGCC AATCCAGTGA TTGATGACGA AGAGATGTAC AAGGAATTCC 11460

AAATCCTTTC CCTTCACGGG CAAACTAAGG ATGCTCTTGC CAAGATGCAA CTGGGGTCAT 11520

GGGAATACGA TATCGTTACA CCAGCCTATA AGTGCAACAT GACCGATATC ATGGCTTCAC 11580

TTGGTTTGGT ACAATTGGAC CGCTATCCAA GTTTGTTGCA ACGCCGTAAG GACATTGTGG 11640

ACCGCTATGA TAGTGGTTTT GCAGGTTCTC GCATCCATCC TTTGGCACAC AAGACTGAAA 11700

CTGTCGAATC TTCACGCCAC CTCTACATCA CCCGTGTAGA AGGAGCAAGC CTAGAAGAAG 11760
GCAACCTCAT CATCCAAGAA TTGGCTAAAG CAGGAATTGC AAGTAATGTT LrtL A t\\. AAAQ. 11820

CGCTTCCTCT CTTGACAGCC TATAAGAATC TTGGATTTGA TATGACGAAC TATCCTAAGG 11880

CCTATGCCTT CTTTGAGAAT GAAATTACCC TCCCTCTTCA TACTAAATTA AGCGATGAAG 11940

AAGTAGACTA TATCATTGAG ACTTTCAAAA CAGTTTCTGA AAAAGTGCTA ACTTTATCAA 12000

AAAAATGACA AACTACAGTC AAGCGAAAGT GATCCTGCCC CTAAAAAGTC TAATTGAGTG 12060

TAAAAACTGT TGTTTTCAAT TGATAATAGT TTACACCTGT AGTTGAGGCC CCTTTCTCCT 12120

CAGAGAGAGA ATTTTTATAG GATTTTCCTT TCTTGTGGGA GTCCCGTGGT TTGAAATAAG 12180

ATGTGAGCAA TTTAGTGTAG CATTTAGAAT CCTTACTAGA CATCATTTAG AAAATCTAGT 12240

GTCTTGTTCT AGTTTTCAAT TCACCCTATT TTTTGAAAGA CGTGAGTTTP CATGAGTGAG 12300

ATTGTGGAAA CTCGCGTCTT TTTTTGTTTT CAGAATATTG [...] TGTGCCTGTC 12360

TTTCATGTTC TAGTCATTCT TTTGCATGAT AGAATTTATA [...] ATTATAATAA 12420

TACAAATATT CTATATGTTT AGTGATGCTT GCTATACATT [...] CTGCGAGACA 12480

ATCTATAAAA CACTTGTCTA CGATTACCTA TATGCCCTAT TCCAGTATTT TAGAAGCACT 12540

GCATCTATTT TTATCGAGGT TAAATCTAGC TTTTATAGAA GGTCTATTTA AGAAATATAT 12600

TGTAGTGTTT TAGTTTCAAT CCGCCATATG AGCGATATTC AGGTAAATAT CCCTGGCGAA 12660

TGCTTGTATG ACAAGGTATT TGTTCTTTCA TTTATAATTT ACAACATATC AACAAATTTA 12720

AATATAGTAA ATGGGATATT TTATATTCAA GCTAAGAAAG ATAGCATCAC TTTTGAATGG 12780

AAGGCTAAAG AGCAAACTAG GAAGTTGGCC ATAGATAGCT CAAAACCCTG CTTTGAGGTT 12840

GTAGATATAG TAAAATGAAA TGAGAATAGG ACAAATTGAT CGGGACAGTC AAATCGATTT 12900

CTAACAATGT TTTAGAAGTA GAGGTGTACT ATTTTAGTTT CAGTCTACTA TAGAACTGAC 12960

CAAGTCAGTA ACCTAGACTT AGGGCAAGGC GGCACTGACC TAGTTTGAAG AGATTTCCGA 13020

AGAGTATAAA TTTTAATATT TTCTTGTGTT ATTCCTTGAC AATTCAATTT GGAAAATATA 13080

TGATAAAGAT AATGACAGCG GTGTCATTCT ATCTATTTTA AGAAAAGTAA TAATCAATTG 13140

TTAAAAATAG TAAAAAAATT GGAGGTTCTG ATGAAATATT TTGTTCCG 13188


(2) INFORMATION FOR SEQ ID NO: 71 











(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32768 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 
AACGAGTGCA TCAGTCTCAG CAAGCACCAG TGCGTCGGCC TCAGCAAGCA CCAGCGCGTC 60

TGAATCCGCA TCAACCAGTG CCTCAGCTTC AGCAAGTACC TCAGCATCTG AATCAGCATC 120

AACAAGTGCA TCGGCTTCAG CAAGCACAAG TGCTTCAGCC TCAGCAAGTA TCTCAGCGTC 180

TGAATCGGCA TCAACGAGTG CGTCCGCTTC AGCAAGTACT AGCGCCTCAG CATCAGCGTC 240

AACAAGTGCT TCGGCTTCAG CGTCAACGAG TGCGTCTGAG TCAGCATCAA CGAGTACGTC 300

AGCCTCAGCA AGCACATCAG CTTCTGAATC TGCATCAACC AGTGCGTCAG CCTCAGCATC 360

GACAAGCGCC TCAGCTTCAG CAAGTACCAG TGCGTCAGCC TCAGCAAGTA CCAGTGCTTC 420

AGCCTCAGCG TCGACAAGTG CGTCGGCCTC AACCAGTGCA TCTGAATCGG CATCAACCAG 480

TGCGTCAGCC TCAGCAAGTA CTAGCGCCTC AGCCTCAGCA TCAACGAGTG CGTCCGCTTC 540

AGCAAGTACT AGTGCATCAG CATCAGCATC AACGAGTGCA TCGGCTTCAG CAAGTACCAG 600

CGCCTCAGCT TCAGCAAGCA CCAGTGCGTC AGCCTCAGCA AGTACCAGCG CCTCAGCCTC 660

AGCAAGCACC AGTGCCTCAG CTTCAGCAAG TACCAGTGCG TCAGCCTCAG CGTCGACAAG 720

TGCGTCGGCT TCAGCAAGTA CCTCAGCGTC TGAATCAGCA TCAACGAGTG CATCAGCTTC 780

AGCATCAACA AGTGCTTCAG CTTCAGCAAG TATCTCAGCG TCTGAATCGG CATCAACGAG 840

TGCGTCCGCT TCAGCAAGTA CTAGCGCCTC AGCATCAGCG TCAACAAGTG CTTCGGCTTC 900

AGCGTCAACG AGTGCGTCTG AGTCAGCATC AACGAGTACG TCAGCCTCAG CAAGCACATC 960

AGCTTCTGAA TCTGCATCAA CCAGTGCGTC AGCCTCAGCA TCGACAAGCG CCTCAGCTTC 1020

AGCAAGTACC AGTGCGTCAG CCTCAGCAAG TACCAGTGCT TCAGCCTCAG CGTCGACAAG 1080

TGCGTCGGCC TCAACCAGTG CATCTGAATC GGCATCAACC AGTGCGTCAg CCTCAGCAAG 1140

TACTAGCGCC TCAGCCTCAG CATCAACGAG TGCGTCCGCT TCAGCAAGTA CTAGTGCATC 1200

AGCATCAGCA TCAACGAGTG CATCGGCTTC AGCAAGTACC AGCGCCTCAG CTTCAGCAAG 1260

CACCAGTGCG TCAGnCTCAG CAAGTACCAG CGCCTCAGCC TCAGCAAGCA CCAGTGCCTC 1320

AGCTTCAGCA AGTACCAGTG CGTCAgCCTC AGCGTCGACA AGTGCGTCGG CTTCAGCAAG 1380

TACCTCAGCG TCTGAATCAG CATCAACGAG TGCATCAGCT TCAGCATCAA CAAGTGCTTC 1440

AGCTTCAGCA AGTACCAGTG CGTCGGCTTC AGCATCAACG AGTGCTTCAG TCTCAGCGTC 1500

AACCAGTGCC TCTGAATCAG CATCAACAAG TGCCTCGGCT TCAGCAAGCA CCAGTGCGTC 1560

GGCTTCAGCA AGTACTAGTG CATCGGCTTC AGCATCGACA AGTGCGTCTG AATCGGCATC 1620

AACGAGTGCT TCGGCTTCAG CATCAACGAG TGCGTCAGCC TCAGCAAGCA CATCAGCTTC 1680

TGAATCTGCA TCAACCAGTG CGTCCGCTTC AGCGTCAACC AGTGCGTCGG CTTCAGCGTC 1740
GACAAGTGCT TCGGCTTCAG CATCAACGAG TGCGTCGGCC TCAGCAAGCG CAAGTACCTC 1800
AGCGTCAGct TCCGCCTCAA CCAGTGCGTC GGCTTCAGCA AGCACAAGTG CGTCAGCCTC 1860
AGCAAGTATC TCAGCGTCTG AATCGGCATC AACGAGTGCG TCTGAGTCAG CATCAACGAG 1920 
TACGTCAGCC TCAGCAAGCA CATCAGCTTC TGAATCTGCA TCAACCAGTG CGTCAGCCTC 1980 

AGCATCGACA AGCGCCTCAG CTTCAGCAAG TACCAGTGCT TCAGCCTCAG CGTCGACAAG 2040 

TGCGTCGGCC TCAACCAGTG CATCTGAATC GGCATCAACC AGTGCGTCAG CCTCAGCAAG 2100 

TACTAGTGCA TCAGCTTCAG CATCAACGAG TGCATCGGCT TCAGCATCAA CCAGTGCCTC 2160 

GGCTTCAGCG TCAACCAGTG CGTCAGCTTC AGCAAGTACC AGTGCTTCAG TCTCAGCATC 2220 

AACAAGTGCT TCAGCCTCAG CATCGACAAG TGCCTCGGCT TCAGCAAGCA CATCAGCATC 2280 

TGAATCAGCG TCAACCAGTG CTTCGGCTTC AGCAAGTACC AGTGCTTCAG CTTCAGCATC 2340 

AACCAGCGCC TCGGCCTCAG CAAGCACCTC AGCTTCTGAA TCGGCCTCAA CCAGCGCCTC 2400 

GGCCTCAGCA AGCACCTCAG CTTCTGAATC GGCCTCAACC AGCGCCTCAG CCTCAGCATC 2460 

AACGAGTGCT TCGGCTTCAG CAAGCACAAG CGCCTCGGGT TCAGCATCAA CGAGTACGTC 2520 

AGCTTCAGCG TCAACCAGTG CTTCAGCCTC AGCATCAACA AGTGCGTCAG CCTCAGCAAG 2580 

TATCTCAGCG TCTGAATCGG CATCAACGAG TGCGTCTGAG TCAGCATCAA CGAGTACGTC 2640 

AGCCTCAGCA AGCACCTCAG CTTCTGAATC GGCCTCAACC AGTGCGTCAG CCTCAGCATC 2700 

GACAAGCGCC TCAGCTTCAG CAAGTACCAG TGCTTCAGCC TCAGCGTCGA CAAGTGCGTC 2760 

GGCCTCAACC AGTGCATCTG AATCGGCATC AACGAGTGCG TCAGCCTCAG CAAGTACTAG 2820 

TGCATCGGCT TCAGCATCAA CCAGTGCCTC GGCTTCAGCG TCAACCAGTG CGTCAGCTTC 2880

AGCAAGTACC AGTGCTTCAG TCTCAGCATC AACAAGTGCT TCAGCCTCAG CATCGACAAG 2940 
TGCCTCGGCT TCAGCAAGCA CATCAGCATC TGAATCAGCG TCGACAAGCG CCTCAGCTTC 3000

AGCAAGTACC AGTGCGTCAG CCTCAGCGTC GACAAGTGCG TCAGCCTCAG CAAGTACTAG 3060 

TGCATCAGCT TCAGCATCAA CGAGTGCATC GGCTTCGGCG TCAACCAGTG CATCAGAGTC 3120 

AGCAAGTACC AGTGCGTCAg CTTCCGCATC AACAAGTGCC TCGGCTTCAG CAAGCACCAG 3180 

TGCCTCGGCT TCAGCAAGTA CTAGCGCCTC AGCCTCAGCC TCAACCAGTG CGTCAGCCTC 3240 

AGCAAGTATC TCAGCGTCTG AATCGGCATC AACGAGTGCG TCCGCTTCAG CAAGTACTAG 3300 

CGCCTCAGCC TCAGCGTCAA CAAGTGCATC GGCTTCAGCG TCAACCAGTG CGTCTGAATC 3360 

GGCATCAACG AGTGCGTCCG CTTCAGCAAG TACTAGCGCC TCAGCCTCAG CGTCAACAAG 3420 

TGCATCGGCT TCAGCATCAA CGAGTGCGTC CGCTTCAGCA AGTACTAGCG CCTCAGCCTC 3480 

AGCGTCAACA AGTGCATCGG CTTCAGCGTC AACGAGTGCG TCTGAGTCAG CATCAACGAG 3540 
TGCGTCAGCC TCAGCAAGCA CATCAGCTTC TGAATCTGCA TCAACCAGTG CGTCAGCCTC 3600

AGCATCGACA AGCGCCTCAG CTTCAGCAAG TACCAGTGCG TCAGCCTCAG CGTCGACAAG 3660

TGCGTCGGCT TCAGCAAGTA CCAGTGCGTC AGCCTCAGCA AGTACCAGTG CGTCAGCCTC 3720

AGCGTCGACA AGTGCGTCGG CCTCAACCAG TGCATCTGAA TCGGCATCAA CCAGTGCGTC 3780

AGCCTCAGCA AGTACTAGTG CATCAGCTTC AGCATCAACG AGTGCATCGG CTTCAGCATC 3840

AACCAGTGCA TCAGAGTCAG CAAGTACCAG TGCGTCAGCT TCGGCATCAA CAAGTGCCTC 3900

GGCTTCAGCA AGTACTAGCG CCTCAGCCTC AGCGTCAACA AGTGCTTCAG CTTCCGCGTC 3960

AACCAGCGCC TCGGCCTCAG CAAGTATCTC AGCGTCTGAA TCGGCATCAA CAAGTGCCTC 4020

GGCTTCAGCA TCAACGAGTG CATCAGTCTC AGCAAGCACC AGTGCGTCGG CCTCAGCAAG 4080

CACCAGCGCG TCTGAATCCG CATCAACCAG TGCGTCAGCT TCAGCAAGTA CCTCAGCATC 4140

TGAATCAGCA TCAACAAGTG CCTCGGCTTC AGCAAGCACA AGTGCTTCAG CCTCAGCAAG 4200

TATCTCAGCG TCTGAATCGG CATCAACGAG TGCGTCGGCT TCAGCAAGTA CTAGCGCCTC 4260

AGCATCAGCG TCAACAAGTG CTTCGGCTTC AGCGTCAACG AGTGCGTCTG AGTCAGCATC 4320

AACGAGTACG TCAGCCTCAG CAAGCACATC AGCTTCTGAA TCTGCATCAA CCAGTGCGTC 4380

AGCCTCAGCA TCGACAAGCG CCTCAGCTTC AGCAAGTACC AGTGCGTCAG CCTCAGCAAG 4440

TACCAGTGCT TCAGCCTCAG CGTCGACAAG TGCGTCGGCC TCAACCAGTG CATCTGAATC 4500

GGCATCAACC AGTGCGTCAG CCTCAGCAAG TACTAGCGCC TCAGCCTCAG CATCAACGAG 4560

TGCGTCCGCT TCAGCAAGTA CTAGTGCATC AGCTTCAGCA AGTACTAGCG CCTCAGCCTC 4620

AGCGTCGACA AGCGCCTCAG CTTCAGCAAG TACCAGTGCG TCAGCCTCAG CGTCGACAAG 4680

TGCGTCGGCT TCAGCAAGTA CCTCAGCGTC TGAATCAGCA TCAACAAGTG CGTCGGCTTC 4740

AGCATCAACG AGTGCATCAG CTTCAGCATC AACAAGTGCT TCAGCTTCAG CAAGTACCAG 4800

TGCGTCGGCT TCAGCATCAA CGAGTGCTTC AGTCTCAGCG TCAACCAGTG CCTCTGAATC 4860

CGCATCAACA AGTGCCTCGG CTTCAGCAAG CACCAGTGCT TCGGCTTCAG CGTCAACGAG 4920

TGCGTCTGAG TCAGCATCAA CGAGTGCGTC AGCCTCAGCA AGCACATCAG CTTCTGAATC 4980

TGCATCAACC AGTGCGTCAG CTTCCGCATC AACAAGCGCC TCGGCCTCAG CAAGTACAAG 5040

TGCTTCAGCC TCAGCATCAA CCAGTGCATC AGCTTCAGCC TCAACAAGTG CTTCAGCCTC 5100

AGCGTCAACC AGTGCCTCGG CTTCAGCAAG TACCAGTGCG TCAGCTTCAG CAAGCACAAG 5160

TGCGTCAGCT TCAGCATCAA CCAGTGCTTC GGCTTCGGCA TCAACAAGTG CCTCAGCATC 5220

AGCATCAACG AGTGCGTCAG CCTCAGCAAG TACTAGTGCA TCAGCATCAG CATCAACCAG 5280
TGCATCAGCC TCAGCAAGTA TCTCAGCGTC TGAATCGGCA TCAACGAGTG CATCAGCATC 5340

AGCATCAACG AGTGCATCGG CTTCAGCGTC AACCAGTGCA TCAGTCTCAG CAAGCACCAG 5400 

TGCGTCGGCT TCAGCATCAA CGAGTGCCTC AGCCTCAGCA AGTATCTCAG CGTCTGAATC 5460 

GGCATCAACG AGTGCGTCAG CCTCAGCAAG TACTAGTGCA TCGGCTTCAG CAAGCACCAG 5520 

TGCGTCGGCT TCAGCATCAA CCAGTGCCTC AGCCTCAGCA AGTATCTCAG CGTCTGAATC 5580 

GGCATCAACG AGTGCGTCAG CCTCAGCAAG TACTAGTGCA TCAGCmTCAG CATCAACGAG 5640 

TGCATCGGCT TCAGCAAGTA CCAGCGCCTC AGCTTCAGCA AGCACCAGTG CGTCAGCCTC 5700 

AGCAAGTACC AGCGCCTCAG CCTCAGCAAG CACCAGTGCC TCAGCTTCAG CAAGTACCAG 5760 

TGCGTCAGcT CAGCATCAAC AAGTGCTTCA GCTTCGGCCT CAACAAGTGC GTCAGCTTCA 5820 

GCATCAACGA GTGCGTCGGC TTCAGCAAGC ACCAGTGCCT CGGCCTCAGC AAGCACCAGT 5880 

GCTTCAGCTT CAGCATCAAC AAGTGCGTCA GCTTCAGCAA GTACATCAGT TTCAAATTCA 5940 

GCAAACCATT CGAACTCACA AGTTGGAAAT ACTTCTGGAT CGACAGGTAA ATCCCAAAAA 6000 

GAATTGCCTA ATACAGGTAC TGAGTCGTCA ATTGGATCTG TGTTACTTGG AGTTCTAGCA 6060 

GCTGTTACAG GTATTGGATT GGTTGCGAAA CGCCGTAAAC GTGATGAAGA AGAGTAAGAC 6120 

AACCTGTAAA GTTAGGCTAA ACTAACTCGC GCACATAAAT CAAGGAGAAA ATTGCTAGTG 6180 

GATGATAAAA TAACAGTCAT TGTACCAGTA TACAATGTGG AAAACTATCT GAGGAAGTGC 6240 

CTAGATAGTA TTATTACTCA AACATATAAA AATATTGAGA TTGTTGTCGT TAATGATGGT 6300 

TCTACGGATG CTTCAGGTGA AATTTGTAAA GAATTTTCAG AAATGGATCA CCGAATTCTC 6360 

TATATAGAAC AAGAAAATGC TGGTCTTTCT GCCGCACGAA ACACCGGTCT GAATAATATG 6420 

TCCGGAAATT ATGTGACCTT TGTGGACTCG GATGATTGGA TTGAGCAAGA TTATGTAGAA 6480 

ACTCTATATA AAAAAATAGT AGAGTATCAG GCTGATATTG CAGTTGGTAA TTATTATTCT 6540 

TTCAACGAAA GTGAAGGAAT GTTCTACTTT CATATATTGG GAGACTCCTA TTATGAGAAA 6600 

GTATATGATA ATGTTTCTAT CTTTGAGAAC TTGTATGAAA CTCAAGAAAT GAAGAGTTTT 6660 

GCTTTGATAT CTGCTTGGGG TAAACTCTAT AAGGCAAGAT TGTTTGAGCA GTTGCGCTTT 6720 

GACATAGGTA AATTAGGAGA AGATGGTTAC CTCAATCAAA AGGTATATTT ATTATCAGAA 6780 

AAGGTAATTT ATTTAAATAA AAGTCTTTAT GCTTATCGGA TTAGAAAAGG TAGTTTATCA 6840 

AGAGTTTGGA CAGAAAAGTG GATGCACGCT TTAGTTGATG CTATGTCTGA ACGTATTACG 6900 

CTACTAGCTA ATATGGGTTA TCCTCTAGAG AAACACTTGG CAGTTTATCG TCAGATGTTG 6960 

GAAGTCAGTC TCGCCAACGG TCAAGCTAGT GGTTTATCTG ACACAGCAAC GTATAAAGAG 7020 

TTTGAAATGA AACAAAGGCT TTTAAATCAG CTATCGAGAC AAGAGGAAAG TGAAAAGAAA 7080 
GCCATTGTCC TCGCAGCAAA CTATGGCTAT GTAGACCAAG TTTTAACGAC AATCAAGTCT 7140

ATTTGTTATC ATAATCGTTC GATTCGTTTT TATCTGATTC ATAGCGATTT TCCAAATGAA 7200

TGGATTAAGC AATTAAATAA GCGCTTAGAG AAGTTTGACT CAGAAATTAT TAATTGTCGG 7260

GTAACTTCTG AGCAAATTTC ATGTTATAAA TCGGATATTA GTTACACAGT CTTTTTACGC 7320

TATTTCATAG CTGATTTCGT GCAAGAAGAC AAGGCCCTCT ACTTGGACTG TGATCTAGTT 7380

GTAACGAAAA ATCTGGATGA CTTGTTTGCT ACAGACTTAC AAGATTATCC TTTGGCTGCT 7440

GTTAGAGATT TTGGGGGCAG AGCTTATTTT GGTCAAGAAA TCTTTAATGC CGGTGTTCTC 7500

TTGGTAAACA ATGCTTTTTG GAAAAAAGAG AATATGACCC AAAAATTAAT TGATGTAACC 7560

AATGAATGGC ATGATAAGGT GGATCAGGCA GATCAGAGCA TCTTGAATAT GCTTTTTGAA 7620

CATAAATGGT TGGAATTGGA CTTTGATTAT AATCATATTG TCATTCATAA ACAGTTTGCT 7680

GATTATCAAT TGCCTGAGGG TCAGGATTAT CCTGCTATTA TTCACTATCT TTCTCATCGG 7740

AAACCGTGGA AAGATTTGGC GGCCCAAACC TATCGTGAAG TTTGGTGGTA CTATCATGGG 7800

CTTGAATGGA CAGAATTGGG ACAAAACCAT CATTTACATC CATTACAAAG ATCTCACATC 7860

TATCCAATAA AGGAACCTTT CACTTGTCTA ATCTATACTG CCTCAGACCA TATTGAACAA 7920

ATTGAGACAT TGGTTCAATC CTTGCCTGAT ATTCAGTTTA AGATAGCAGC TAGAGTAATA 7980

GTTAGTGATC GATTGGCTCA GATGACAATT TATCCAAACG TGACTATATT TAACGGAATT 8040

CACTATTTGG TAGATGTCGA TAATGAATTG GTAGAAACCA GTCAAGTACT TTTAGATATT 8100

AATCATGGCG AAAAGACAGA AGAAATTCTC GATCAATTTG CTAATCTTGG CAAGCCTATC 8160

TTATCCTTTG AAAATACTAA AACCTATGAA GTAGGTCAGG AGGCATATGC TGTTGACCAA 8220

GTTCAAGCAA TGATTGAAAA ATTGAGAGAA ATAAGCAAAT GAAGAAAAAT CATTTAGTAG 8280

GAGATGCTCT GATTTTGACG GTTAGTGATC AGATTGAAGA GTTGGATTAT TTTTTATAAA 8340

ATTTCTCCGT TCATCATATA TGAAAGTTGT TCAAACATCA GAGTGCTTTA TAAAATATAA 8400

ATAGACCTAA AGATATTTAA TATGAACTGC ACCCCAAAAG TTAGACAGAA AAAATCTAAC 8460

TTTTTGGsGT CAGTACAATA TTAGGGTGTG ATTAATTATC TTTTTAGGTG AAAATGATTC 8520

TATATTATAG CTGTTTGATA CGAAATTTAT TATAAGGAAA TTATGTTAAT GAATACAAAA 8580

TCTATAGTTT TTAATGCAGA TAATGATTAT GTAGATAAAT TAGAAACTGC AATTAAATCT 8640

ATTTGTTGTT ATAATAATTG TTTAAAATTT TATGTATTTA ATGATGATAT TGCGTCAGAG 8700

TGGTTTTTGA TGATGAATAA GCGATTGAAG ACTATACAAT CTGAAATCGT TAATGTAAAG 8760

ATTGTAGATC ATGTTCTTAA AAAGTTTCAT TTACCGTTAA AGAATTTAAG TTATGCCACT 8820
TTCTTTCGTT ATTTTATACC TAATTTTGTC AAAGAAAGTC GTGCTTTATA CCTAGATTCT 8880

GACATCATTG TTACAGGAAG TTTAGACTAT TTATTTGATA TAGAACTAGA TGGTTATGCC 8940 

TTGGCAGCAG TAGAAGATTC TTTTGGTGAT GTTCCTTCTA CCAATTTTAA CTCCGGAATG 9000 

TTATTAGTTA ATGTAGATAC TTGGAGAGAT GAAGATGCTT GTTCGAAACT GTTAGAACTG 9060 

ACCAATCAAT ATCATGAAAC AGCATATGGA GATCAAGGAA TTTTAAATAT GTTATTCCAT 9120 

GATAGATGGA AAAGATTAGA CCGAAATTTT AATTTTATGG TGGGGATGGA TAGCGTCGCA 9180 

CACATAGAAG GAAATCATAA ATGGTATGAG ATTTCTGAGT TGAAAAATGG AGATTTACCT 9240 

AGTGTTATAC ATTATACTGG GGTAAAACCT TGGGAAATAA TTTCCAATAA TCGCTTTAGA 9300 

GAAGTTTGGT GGTTTTATAA TCTGTTAGAA TGGTCTGATA TTTTATTGAG AAAAGACATT 9360 

ATTAGTCGTA GTTTCGAAGA ACTTGTATAC AGTCCTAAAG CTCATACAGC AATTTTTACA 9420 

GCTAGTTGTG AGATGGAGCA TGTAGAATAT TTGATAGAAA ATTTACCAGA GGTACATTTT 9480 

TCTATACTAG CACATACATA TTTTGCGTCT AGTGTCGTTG CTTTATTAAG ATATAGCAAT 9540 

GTTACGATTT ATCCTTGTTT TTCTCCATTT GATTATCGAA AAATTTTGGA TAATTTAGAT 9600 

TTTTATTTAG ATATTAATCA TTATAAAGAA GTGGATAATA TTGTATCCGT TGTTCAACAA 9660 

CTATCTAAAC CAATTTTTAC CTTTGAAAAT ACTAGTCATG ATATAGGCAA TGAAACTAAT 9720 

ATATTTTCTT CAACCGAACC AAACAAAATG GTAGAGGCTA TTAGACAATT TATAGGAGAA 9780 

TAAGTTTATG GCAGACGAAC TAATTAGTAT TGTAGTTCCA ATCTACAACG TTGAGAATTA 9840 

TTTGCGAATG TGTTTGGATA GCATTCAGAA TCAGACGTAT CAAAATTTTG AGTGTTTATT 9900 

AATCAATGAT GGCTCTCCAG ATCATTCATC CAAAATATGT GAAGAATTTG TAGAGAAAGA 9960

TTCTCGTTTC AAATATTTTG AGAAAGCAAA CGGCGGTCTT TCATCAGCTC GTAACCTAGG 10020 

TATTGAATGT TCGGGGGGGG GCGTACATTA CTTTTGTAGA CTCTGATGAT TGGTTGGAAC 10080 

ATGATGCTTT AGACCGATTA TATGGTGCTT TGAAAAAGGA AAACGCAGAT ATTAGTATCG 10140 

GGCGTTATAA TTCTTATGAT GAAACACGCT ATGTGTATAT GACTTATGTT ACGGATCCAG 10200 

ATGATTCTCT AGAAGTGATA GAAGGTAAAG CAATTATGGA TAGGGAAGGT GTCGAAGAAG 10260 

TCAGAAATGG GAACTGGACT GTAGCTGTCT TGAAGTTATT CAAGAGAGAG TTACTACAAG 10320 

ATTTACCATT TCCTATAGGA AAAATTGCAG AGGATACTTA CTGGACATGG AAGGTACTTC 10380 

TAAGAGCTTC GAGGATAGTC TATTTGAATC GTTGTGTTTA CTGGTACCGT GTTGGTTTAT 10440 

CTGATACTTT ATCGAATACA TGGAGTGAAA AGCGTATGTA TGATGAAATT GGGGCTAGGG 10500 

AAGAAAAGAT AGCTATTTTA GCAAGTTCAG ACTATGACTT GACCAATCAT ATTTTGATTT 10560 

ATAAAAATAG ATTACAAAGA GTGATAGCAA AATTAGAAGA ACAAAATATG CAGTTCACAG 10620 
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AGATTTACAG 


AAGAATGATG 


GAAAAATTGT 


CTTTACTTCC 


GTAGATAGTA ATAAAAAATG 


'10680 


AGATAGCGTA 


ATATGAAACT 


ACATTTAACA 


AATTTATACG 


GCATGGCTGG 


TGATAGTACG 


10740 


GTTATCTTAG 


CTCAAAATGC 


TGTTCAAAAG 


ATAGCTAGTC 


AACTGGGATT 


TAGAGAGGTT 


10800 


GGTATTTATT TTTACAACAT TGCTTCAGAT AGTCCTTCTG AAATGAATAA GCGTCTGGAT 10860

GGTATTATGG CCAGTATCTC TATTGGGGAT ATTTTAGTCT TTCAGTCTCC AACCTGGAAT 10920

GGTTTTGAAT TTGATCGTCT CTTGTTTGAT AAGCTAAAGG ATATGCAGGT GAAAATTATT 10980

TGCTTTATCC ATGATGTTGT TCCCCTCATG TTTGATAGTA ACTATTATCT CATGAAAGAT 11040

TATCTGTATA TGTATAATCT ATCAGATGTT TTGATAGTGC CGTCAGAGAG AATGAAAACA 11100

CGCCTGATGG AAGAAGGATT GACGACTAAG AAGATTCTTG TTCAAGGGAT GTGGGATCAT 11160

CCTCATGATT TATCCTTATA CACCCCTGCT TTTAAAAAAG AACTTTTTTT TGCTGGAAGT 11220

TTAGAGCGTT TTCCAGACTT ACAAAATTGG TCTCAAGATA CGCCTTTGAG AGTATTTTCA 11280

AATAAAGGGG AAGCTAGTTC TAGTGCTAGA AGTCTCAGCA TCGAAGGATG GAAAAAAGAT 11340

GAGGAATTGT TGCTAGAATT ATCAAAGGGT GGATTTGGCC TTGTCTGGGG AACCcATCAA 11400

AATGAGGGAG AAAGTAACCA ATACTATACC TTGAATATAT CTCATAAGGT GAGTACCTAT 11460

CTAACAGCGG GCATTCCAGT CATTGTACCA AGTAGCTTGT CAACTGCTAA ATTTATAGTA 11520

GATCAAGGCT TGGGCTTTAT GGCGGATAGT CTGGAAGAGG TTCATGAGAT AGTTGATAAA 11580

ATGAATCTAC AAGAATATCA AGAAATGACG AATCGTATCA AGACCTTTAG CTATTTGTTA 11640

AAAGAGGGCT ATTTCACTAA AAAGTTATTG GTAGATGCAA TCTATCACTT GGGAATTGAT 11700

TAAGGGAATG AAATGAACAA AACAATTGTA CTAGCAGGGG ATCGCAATTA CACCAGGCAG 11760

TTAGAAACAA CGATAAAATC TATTTTATAC CACAATCGAG ATGTTAAGAT TTATATTTTG 11820

AATCAAGATA TCATGCCAGA TTGGTTTCGC AAACCACGAA AAATAGCTCG CATGTTAGGT 11880

AGTGAGATTA TCGATGTTAA ACTACCTGAA CAAACTGTGT TTCAAGATTG GGAAAAGCAA 11940

GATCAGATTA GTAGCATTAC TTATGCTAGA TATTTTATTG CAGATTATAT CCAAGAAGAT 12000

AAGGTTTTAT ATTTAGACAG TGATTTGATT GTAAATACTT CTTTAGAGAA ATTATTTAGT 12060

ATTTGTTTAG AAGAAAAATC ACTCGCAGCA GTTAAAGATA CAGATGGAAT TACATTTAAT 12120

GCAGGTGTTT TATTAATCAA CAATAAAAAA TGGCGTCAAG AGAAATTAAA AGAACGACTA 12180

ATTGAACAGA GCATTGTTAC AATGAAGGAA GTTGAAGAAG GCCGTTTCGA GCATTTTAAT 12240

GGTGATCAAA CGATTTTTAA TCAGGTCTTG CAAGATGATT GGTTAGAACT AGGTCGAGCT 12300


TATAATTTAC AAGTAGGGCA TGATATTGTG GCTTTGTATA ACAATTGGCA GGAACATCTG 12360

GCTTTTAATG ATAAACCAGT GGTGATTCAT TTTACGACCT ACAGAAAACC CTGGACTACC 12420

TTGACAGCCA ATCGTTATCG TGATTTATGG TGGGAATTCC ATGATTTGGA GTGGAGTCAG 12480

ATTTTACAAC ACCATATGGG AGAATTTGAA CTAATATCGC CTCTAGATAA GGAATTTTCT 12540

TGCTTAACCT TAACGAATTC CCAAGATTTA GAAGGAATAG AAGAGCTAGT TACAGCTCTA 12600

CCTGAGGTGG TATTTCATAT CGCAGCTTGG ACGGATATGG GAGATAAATT AAAAAAATTA 12660

GCTGTATATA ATAATGTGAG ATTGCATCCA CAAATTGTTC CACCGGTCTT AGATAAGCTG 12720

AAAAAGTCAA CAAATCTATA TTTGGATATC AATCATGGTA GTGCAGATGA GAACTTTTTA 12780

AAATCTTTGC AAGAACAAGA AAAAACGCTA CTAGCTTTTC AATCGACTCA GCACGGAGAG 12840

TTAGGACAAA TCGTTTTCGA AAATGGGAAA GTTTCCTTTA TGATTGATAC GATTAAAGAT 12900

TTTAAGAAAA ACGGACATCT TACCTGTTTT CGACAACTTC CAAGTTTAAC TTGTTTAACG 12960

TTTACGGCTT CTCAGTATAT CGAACAATTG GATTACTTGG CTGGACAGTT GCCAAATGTT 13020

GTTTTTCAAA TTGCTGCTTG GACAGCTATG GGGCCAAAAT TATATGATTT GTCTAATCGT 13080

TATCCTAATA TTCAGCTCTA TCCGGCAATT TCTAGAGATA AGCTAGACGA GTTGAAGGAG 13140

AAGATGGATG CTTATTTAGA TATCAACCTA CTGACTTCAA CATCCGATAT CGTTGCAGAA 13200

ATGGCTCATC TATCTAAACC TATACTAGCC TTTTATAAAT CTCAAAATGG GAATAATGGC 13260

CAAAGGTTGT ATTCAAGTGA ACATCCTGAA CGAATGTTGG CTGATTTGCA AAAATTGATA 13320

ACTAAGGATA TGCTAGAAAA ACCGCTTGAT ATAATCCAGG TGAAAGGGAT AGATGAAACC 13380

TTGGATTATA TTATTGAACA CAACTCTTCT TTAGTTCGTT TTGGAGATGG GGAAATCAAT 13440

ATGCTTGCAG GGCATTCAAT TCCCTACCAG GATTATGATG AAGAGTTGGT TTCAATCATG 13500

AGGGACATTA TCGGCCAAGA AAGTCGAGAA GATTTAGTAG TGTGCCTTCC TGATGCTTTT 13560

ACAGATCGTT TTAGGTTTAC ATCGTGGGCG ATTCCATTTT GGAAAGATCA CATGGATCAT 13620

TATATGGATT TTTACAGAGA GTTATGCAGT GATTCATGGT ATGGCTCAAC CTTTGTATCT 13680

CGCCCTTATA TCGATTTTGA AGACAAGAGT CAAGCTAAAG CTCAATTTGA AAAATTGAAA 13740

AGCATTTGGG AAAACCGTGA CTTACTGATA GTCGAAGGTG CGACTTCTCG TTCAGGTGTC 13800

GGAAATGATT TATTCGATGA GGCAAATTCT ATTAAGCGAA TTATCTGTCC TTCTCATAGT 13860

GCCTTTTCTA GAGTTCATGA ACTTGAACAA GAAATTGAAA AGTATGCTGG TGGTCGCTTG 13920

ATTTTATGTA TGCTTGGACC TACAGCAAAA GTTCTGAGTT ATAATCTATG CCAGATGGGC 13980

TATCAAGTTT TGGATGTAGG CCATATTGAC TCAGAGTATG AATGGATGAA AATGGGAGCT 14040

AAAACTAAGG TTAAATTTTC TCATAAACAT ACTGCAGAAC ATAATTTCGA CCAAGATATT 14100


GAATTTATTG ATGATGAAAC CTATAACAGT CAGATTGTTG CACGAATATT AAACTAGACT 14160

ATTTAAAATA AATGATAAGG ATTTAAAATG AGAAATACCA AACGCGCTGT AGTATTTGCA 14220

GGTGATTACG CTTATATTCG ACAAATCGAA ACGGCGATGA AGTCACTCTG TAGACACAAT 14280

AGTCATTTGA AAATTTATCT GCTAAATCAG GACATTCCTC AGGAATGGTT TAGTCAAATA 14340

AGAATATATT TACAAGAGAT GGGGGGCGAC TTGATTGACT GCAAGTTAAT TGGCTCACAG 14400

TTTCAAATGA ATTGGTCTAA TAAATTACCT CATATCAATC ATATGACATT TGCACGCTAT 14460

TTTATTCCAG ATTTTGTAAC AGAAGATAAA GTTCTCTATC TAGATAGTGA TTTGATTGTG 14520

ACTGGTGATT TGACCGATTT GTTTGAATTA GACTTAGGTG AAAATTATTT GGCAGCAGGT 14580

CGTTCTTGCT TTGGAGCAGG AGTCGGCTTC AATGCTGGTG TTCTCTTGAT TAACAACAAA 14640

AAATGGGGAT CTGAAACTAT TCGACAAAAA TTGATTGACT TAACAGAAAA AGAACATGAG 14700

AATGTGGAAG AAGGAGACCA GTCAATTTTG AATATGTTGT TTAAAGATCA ATATAGTTCC 14760

CTTGAAGATC AATATAATTT TCAAATAGGA TATGATTATG GGGCGGCAAC CTTTAAACAT 14820

CAATTCATTT TTGATATTCC GCTCGAACCA CTGCCACTAA TTTTACACTA TATTTCTCAG 14880

GATAAGCCTT GGAATCAATT TTCTGTTGGA CGTCTAAGAG AAGTTTGGTG GGAATACTCT 14940

TTGATGGATT GGTCTGTTAT TTTAAATGAA TGGTTTTCAA AGAGTGTGAA GTACCCTAGT 15000

AAATCACAAA TATTTAAGTT GCAATGTGTT AATTTAACGA ATTCTTGGTG TGTCGAGAAA 15060

ATCGATTATT TGGCGGAGCA ATTGCCAGAA GTTCATTTTC ATATTGTTGC TTATACAAAT 15120

ATGGCAAATG AACTACTAGC TTTAACGCGT TTTCCTAATG TTACCGTATA TCCAAATTCC 15180

TTACCAATGT TATTGGAACA AATAGTAATA GCTTCAGATT TGTATTTGGA TTTGAATCAT 15240

GATCGAAAAT TAGAAGATGC ATATGAGTTT GTGCTTAAGT ACAAAAAACC AATGATAGCT 15300

TTCGACAATA CTTGCTCTGA AAATCTTTCT GAGATTTCAT ATGAAGGTAT CTATCCAAGC 15360

TCCATTCCGA AAAAAATGGT TGCAGCAATC AGATCTTACA TGAGGTAGAG AACAGTATGA 15420

GAAAATCAAT AGTATTAGCG GCAGATAATG CCTATCTTAT TCCTTTAGAG ACGACTATAA 15480

AGTCTGTATT GTATCACAAT AGAGATGTTG ATTTTTATAT TCTCAACAGT GATATAGCTC 15540

CTGAATGGTT TAAATTATTG GGGAGAAAAA TGGAAGTTGT GAATTCTACA ATTCGCAGTG 15600

TACACATTGA TAAAGAACTT TTTGAAAGCT ATAAAACAGG ACCTCATATA AATTATGCTT 15660

CTTACTTTAG ATTTTTTGCG ACAGAAGTGG TTGAATCTGA TAGGGTATTG TATCTGGATT 15720

CCGATATCAT TGTAACTGGG GAACTAGCTA CTTTGTTTGA GATAGATCTC AAAGGATATT 15780

CAATTGGTGC TGTTGATGAT GTCTATGCCT ATGAAGGACG AAAATCTGGA TTTAATACTG 15840


GTATGTTACT AATGGATGTT GCAAAGTGGA AAGAACATTC TATTGTCAAT AGTTTATTGG 15900

AATTAGCGGC CGAGCAGAAT CAAGTTGTTC ATCTTGGGGA TCAGAGTATT TTAAATATTT 15960

ATTTTGAGGA TAATTGGCTA GCCTTAGATA AAACATATAA TTATATGGTG GGTATTGATA 16020

TTTATCACCT TGCTCAAGAA TGTGAACGTC TAGATGACAA TCCACCTACA ATTGTTCACT 16080

ATGCTAGTCA TGATAAACCT TGGAATACAT ATAGTATATC TAGACTACGT GAATTATGGT 16140

GGGTTTATAG AGATTTGGAT TGGTCAGAGA TTGCTTTTCA ACGTTCCGAT TTAAATTATT 16200

TTGAAAGAAG CAATCAGTCT AAAAAACAAG TGATGCTTGT GACATGGAGT GCAGATATAA 16260

AACATTTAGA GTATTTAGTA CAACGGTTAC CTGATTGGCA TTTTCATTTG 16320

GTGATTGTTC TGAGGAGCTG ACCTCTCTAT CACAGTATAC GAATGTAACA vi 1 r\ 1 1\ 1 LAAA 16380

ATGTATTACA TAGTAGAATT GATTGGCTAT TGGACGATTC TATAGTTTAT TT A CI A T A fwp a 16440

ATACAGGTGG AGAGGTTTTT AATGTAGTTA CAAGGGCACA AGAAAGTGGC AAla AAAA L\. 1 16500

TCGCTTTTGA TATCACACGT AAAAGTATGG ATGATGGACT CTATGACGGT ATTTTTTCTG 16560

TGGAGAGACC AGATGATTTA GTGGATAGAA TGAAGAATAT AGAGATAGAG X Aft 1 liAtj 1\jA 16620

ATTAATTAGT GTTGTGGTAC CGATATACAA TACGGGAAAA TATTTAGTGG Ala Ivs 1 Vj i LOA 16680

GCATATTCTG AAGCAAACCT ATCAAAATAT AGAAATTATT TTAGTTGATG nUAj X 1 1_ 1 AC 16740

GGATAATTCT GGGGAAATTT GTGATGCTTT TATGATGCAA GATAATCGTG UAL* I A i r 16800

GCATCAAGAA AATAAGGGGG GGGCAGCACA AGCTAAAAAT ATGGGGATTA 16860

GGGAGAGTAC ATCACGATTG TTGATTCAGA TGATATCGTA AAAGAAAATA lun 1 1 uAAAL 16920

TCTTTATCAG CAAGTCCAAG AAAAGGATGC AGATGTTGTT ATAGGGAATT n^lnlAAl 1A 16980

TGACGAAAGT GACGGGAATT TTTATTTTTA TGTAACAGGG CAAGATTTTT 17040

ATTAGCTATA CAAGAAATTA TGAACCGTCA AGCAGGAGAT TGGAAATTCA 17100

CTTTATATTG CCGACATTTA AGTTGATTAA AAAAGAATTA TTCAATGAAG TTCACTTTTC 17160

AAATGGTCGC CGCTTTGATG ATGAAGCAAC TATGCATCGC TTTTATCTTT TAGCCTCTAA 17220

AATCGTCTTT ATAAACGATA ATCTCTATCT GTATAGAAGA CGTTCAGGAA GCATCATGAG 17280

AACGGAATTT GATCTTTCCT GGGCAAGAGA TATTGTTGAA GTGTTTTCTA AGAAAATATC 17340

GGATTGTGTC TTGGCTGGTT TGGATGTCTC CGTTCTGCGT ATTCGATTTG TCAATCTTTT 17400

AAAAGATTAT AAGCAAACTT TAGAATACCA TCAATTAACA GATACTGAGG AATATAAAGA 17460

TATTTGTTTC AGATTAAAGT TGTTTTTTGA TGCAGAACAA AGAAATGGTA AAAGTTGAAA 17520

TAAAAGAATT GTTATTTACC ATATCACAAA CAATGAAGGT GAGGGGAGTG TTTTATGACT 17580

AAGATTTATT CGTCAATAGC AGTAAAAAAA GGACTATTTA CCTCATTTCT ACTGTTTATC 17640


TATGTATTGG GAAGTCGTAT TATTCTCCCT TTTGTTGACC TAAATACTAA AGATTTTTTA 17700

GGAGGTTCAA CAGCCTATCT AGCCTTCTCA GCCGCCCTAA CAGGTGGGAA TCTAAGAAGT 17760

TTATCAATTT TTTCTGTTGG ATTATCCCCT TGGATGTCCG CCATGATTTT ATGGCAGATG 17820

TTTTCTTTTT CTAAACGGTT GGGTTTAACA TCTACGTCTA TAGAAATACA AGATCGCCGT 17880

AAAATGTACC TGACCTTGCT AATTGCTGTG ATTCAATCCT TGGCAGTTAG CTTGAGACTG 17940

CCAGTACAAT CCTCCTATTC TGCAATATTG GTTGTTCTAA TGAATACAAT ATTGCTGATA 18000

GCAGGAACAT TTTTTCTTGT TTGGTTGTCA GATTTAAATG CGAGTATGGG GATTGGAGGT 18060

TCTATTGTAA TCCTCCTATC CAGTATGGTT TTAAATATTC CTCAGGATGT TTTGGAAACA 18120

TTTCAGACAG TACACATTCC AACAGGGATT ATTGTGTTAC TTGCTTTATT AACCCTTGTC 18180

TTTTCTTATT TACTTGCCCT TATGTATCGA GCTCGCTATT TGGTTCCTGT TAATAAAATT 18240

GGCTTACACA ATCGATTTAA ACGCTATTCT TATCTCGAAA TCATGTTGAA TCCTGCAGGT 18300

GGGATGCCTT ATATGTATGT GATGAGTTTT CTTAGTGTAC CAGCTTATTT GTTCATCTTG 18360

TTGGGATTTA TTTTCCCTAA TCATTCAGGG TTAGCGGCTT TATCAAAGGA ATTTATGGTT 18420

GGAAAGCCTT TGTGGGTCTA TGTTTATATT TCGGTCTTAT TTTTATTTAG TATCATTTTT 18480

GCTTTTGTTA CGATGAATGG AGAAGAGATT GCAGACCGTA TGAAAAAATC TGGAGAATAC 18540

ATTTATGGTA TTTATCCAGG TGCGGATACT AGTCGATTTA TTAATCGATT GGTCCTTCGT 18600

TTCTCAGTCA TAGGTGGTCT CTTTAATGTG ATTATGGCAG GTGGTCCCAT GCTTTTTGTT 18660

TTGTTTGATG AAAAGTTATT ACGATTGGCA ATGATTCCAG GCTTATTTAT GATGTTCGGG 18720

GGCATGATTT TTACGATTAG AGACGAGGTC AAGGCTTTAA GGCTAAATGA GACCTATAGA 18780

CCTTTGATTT AGGAGACTTT TATGTATTAT TTTATTCCAG CTTGGTATGG GTCAGAAAGA 18840

ACATGGCATG CAGATATCAC TCCATGGTAT TTTTCTCATT TTCGTCTAGA GTTTGATGAT 18900

ACCTTTCACC AGATTCGGCT CTTTCAAGAG CAAGATATAG ATTCTCGTCT ATTAGTATTA 18960

GCTTACCAGC CTCATCTACG TTATTTTTTA TATAGACATG GTGTGTTAGA AATGGATACT 19020

TATTCCGTTT TTGATGTTAT GCAAGATTTT CATAATCTCC ATACCCAAGT TTTAAGCATT 19080

AGAGATATTG AGTGGGATGA TGACTGTGAA TTTATTTATA GTCCCTTTAC GATTATCGTT 19140

CAAAAAAATG GGAAGAAATT TGCTAAGGTT GAACATGGAG TTGAAGGCTT CATCAGTGAT 19200

ATACAGTATT TTGAACCAAA TGGTCAAATA CATATGCACC ATATCGTGGA TGATCGTGGG 19260

TTTGTATCGA GCATTATCTT TTTTGAAGAT GGGCAAGCAG CCTATCAAGA ATATCTGAAC 19320

CTCAAGGGAG AGTGGCAATT CAGAGAGCGT TTAAAAGAAG GAGGACAGGT AGAAGTCAAT 19380

CCAATTTTGG GTTATCGCTT TAAAATGCTT ACCTATCAAA ATATGGGAGA TCTGGTGGCA 19440

GAATTTTTTG AGAATTATCT GCAAACGTAT GTGAAGGATC AGGATATTTT TATGCTTCCT 19500

TCTCATTCTC ATCATGACCA GTTGGTACTA GATCGTTTAC CTAGTACTAA TCCTAAACTG 19560 

TTGAGTCTGT TCATTGGACG TAATCCTCAA GATACCTTTA GGGATTTAGA TGTAACTTTT 19620 

GAAAAATCGG ATTTGATTTT GGTGGATAGA GAGGATAGTT TACGATTGTT GCAGGAGTTG 19680 

TATCCTGAAC GAATGCATCA ATGTTATCAT TTATCATCTT TTGACACCCG ATTACGATTG 19740 

GGACGAAGCC AAACTAAGAA AGAATCCATC ATTTATTTTC AACTGGATTT TGAGCAGGGG 19800 

ATTGATAATC AAGCTCTGCT TCAAGTCTTG TCCTTTGTCG CTGAAAATAA GGATACTGAG 19860 

GTGATTTTTG GAGCCTTTGC TGCTAGTCAG GAGCAAATGA ATGAGGTTGA AGGGATTGTT 19920 

GAGTCTTTCA TCCAAGAAAA CATTCAATCC GAAAATCTGG GAAAGGCGAT TGATTATGGT 19980 

GATGCAGAAA ATCCTCTGGA AGAAAATCAA CACCAGGACT TACGGTTACA GTTTGTTAAC 20040 

TTGAATGATG AGTTAGATTT GATAAAAACA CTAGAATTTG TCCGTTTGAT TGTGGATTTA 20100 

AATAGACATC CTCATCTCTA CACACAGATT GCTGGGATTA GTGCAGGAAT TCCTCAAATC 20160 

AACCTAGTTG AAACCGTCTA TGTTGAACAT TTAAAAAATG GTTATTTGTT AGCAGATGTT 20220 

ACAGAATTTT CTAAGGCTGC ACATTATTAC ACAGATAGGT TGAAGGAGTG GAATGAGTCC 20280 

TTGATATATT CAATTGATAA GATTAAGGAG CACACAGGAC AACAATTTCT TGGAAAATTA 20340 

GAGAAATGGA TAGAGGAGGT TAAAAATGTC AAAGGAACTT AATATTTTAC AGATAGGACT 20400 

TGCCAATTGG GAAAATCACT ATGACATACC TGAAAATATG AGTTGGTATT ATTTTTACCC 20460 

AAACTCATCA AAAGCCCTTC GTGAAATAAT TGAAAAAGAG GATATTAACC GTTTTCATGC 20520 

AGTTTTAATA GAAGATGGTC AGTATTCCAG AGACTTATTT TCCTATGTAA AATATTTTGA 20580 

ACCTTATACT TTATTTTATA ACCAGAATCT ACAAATAAAT GATAGAGAGG TTGTGGATTT 20640 

TCTAAAAAAA CGATGTGCAC AAGCAATTGA TTTTTTAAGT CCCCAACAAC TAATCAATGA 20700 

TTTAAGTAAA TCTCTTTTTG GCGGTGGGTA TGGTGATAAA CTCTTTCCTC CGACGATACA 20760 

AGTCAATCCA AATTTTACAG GAGCTATTTC TTATCAAGGA TTGGATTATG TCAGTTTGGA 20820 

AGGTGAGTTT GGGCAAGATT TTGCCCAGCT TGCCTATTGG GCTTATAATA TTATGGTGCA 20880 

AAAAACACTC CCTATTGAGT TGTGGCTTGA ATATGAGAAG GAAGGCAATT GTGACTTTCG 20940 

TTTAGTAATC CGTAAAATGT GGAGTGGGTC TGTTGATGAT TTCTTTGAAG AAGTAATAGT 21000 

ATCTGAAAAA GACTTGGAGC AAGCACTTTT TATGGATAGT CGAGACGGAG ACTACTTTCT 21060 

CTCGATATCT GTTGAAGCAA GAGGTCGTGG AACTATCAAA CTAGGTAATC TTCACCAACG 21120 

ATGGAGTCGA AAACAATTTG GTAAGTTTGT ACTTGGTGGA AATATCCTAC ATGATTCCAA 21180 

GCGTGATGAA ATAAACTATT TCTTCCATCC AGGTGATTTT AAACCGCCTT TGACTGTCTA 21240

TTTTGCAGGT TATCGACCTG CAGAAGGATT CGAGGGTTAC TTTATGATGA AAACTCTTGG 21300

ATGTCCCTTC ATTTTATTTT CTGATCCACG TTTAGAGGGG GGAGCTTTTT ATCTCGGAAC 21360 

GGATGAGCTA GAGGGAAAAG TAAAGGATAC GATCACTCAC TATCTTGATT ATTTAGGCTT 21420 

TGATCATAAG GATTTGATTT TATCAGGTCT TTCTATGGGA ACGTTTCCGG CTCTCTATTA 21480 

TGGTGCTTCT TTTGAACCCC ATGCCATCAT AGTTGGTAAG CCCTTGGCTA ATTTAGGAAC 21540

TATAGCTAGT CGTGGACGTT TGGACGCACC GGGTGTCTCT AACTTAGCTT TTGATTGTTT 21600 

AATTCATCAT ACAGGTGGGA CAAGTTCTCA AGATATGACG GAGTTGGATC AGCGTTTTTG 21660 

GAAAATTTTT AAACAAGCAA ATTTTTCAAA GACAACCTTT GGTTTATCCT ATATGAAAGA 21720 

TGAAGAAATG GATCCACAAG CCTATGAACA ATTAGTGTCT TATCTGTGTA ATACAGGTGC 21780 

GAAGATTTTA TCTAAAGGAA CTGCTGGACG ACACAATGAT GATACAGATA CCAATATTTC 21840 

TTGGTTTTTG CACTTTTATA GAATGGTCTT AGAGACTGGT TTTGGAAGGG AGAAAAGATG 21900 

ATTATTACAC AGAGACAGTC TATTCATTGG GGAGAAGTTG GTGGGACTTA TATGTATGGA 21960 

ACAACTGTAT CTTATTACCC TGACAAAAGT GTTCGTCTGT ATAATCCTCT ATTGCCATCT 22020

GGTGAGATTC TAAAGACTTG GTTTTCTAGT GTCAATTACC AGGCTGCACG AACCCAACCT 22080 

CAGGTTCCCT TATTAAAAAG AAAGCAGGAG TATCAACTAT CACTGGTTTT TGACTGTCAG 22140 

CCTGAAAATG GAGTTTATAC CAAGATAACT TTTTTTGACC GCTATGGTGA TATTTTAGAA 22200 

AAAAAGGTAG AAAAAGTGAA AGATTTCATA TTTACTTATC CAGAAGATAG TTATACTTAT 22260 

CGAGTTTCTC TTTTAAGTGC TGGATTTGAG TCCTTAACTT TTTATCATTT TTCTATCAAG 22320 

GAGATCAGAA GTGTTTAGAC GTTTAGGTGA AGATTTCGAG CTTAGGAAAG TGAAAAAGAT 22380 

TTTAAAGCAG ATTAATGCCC TGAAAGGCAA GATGTCCTCT GTTTCGGATC AAGAATTAGT 22440 

AGCTAAAACA GTAGAGTTTC GTCAGCGTCT TTCCGAGGGA GAAAGTCTAG ACGATATTTT 22500 

GGTTGAAGCT TTTGCTGTGG TGCGTGAAGC AGATAAGCGG ATTTTAGGGA TGTTTCCTTA 22560 

TGATGTTCAA GTCATGGGAG CTATTGTCAT GCACTATGGA AATGTTGCTG AGATGAATAC 22620 

GGGGGAAGGT AAGACCTTGA CAGCTACCAT GCCTGTCTAT TTGAACGCTT TTTCAGGAGA 22680 

AGGAGTGATG GTTGTGACTC CTAATGAGTA TTTATCAAAG CGTGATGCCG AGGAAATGGG 22740 

TCAAGTTTAT CGTTTTCTAG GATTGACCAT TGGTGTACCA TTTACGGAAG ATCCAAAGAA 22800 

GGAGATGAAA GCTGAAGAAA AGAAGCTTAT CTATGCTTCG GATATCATCT ACACAACCAA 22860 

TAGTAATTTA GGTTTTGATT ATCTAAATGA TAACCTAGCC TCGAATGAAG AAGGTAAGTT 22920 

TTTACGACCG TTTAACTATG TGATTATTGA TGAAATTGAT GATATCTTGC TTGATAGTGC 22980

ACAAACTCCT CTGATTATTG CGGGTTCTCC TCGTGTTCAG TCTAATTACT ATGCGATCAT 23040

TGATACACTT GTAACAACCT TGGTCGAAGG AGAGGATTAT ATCTTTAAAG AGGAGAAAGA 23100 

GGAGGTTTGG CTCACTACTA AGGGGGCCAA GTCTGCTGAG AATTTCCTAG GGATTGATAA 23160 

TTTATACAAG GAAGAGCATG CGTCTTTTGC TCGTCATTTG GTTTATGCGA TTCGAGCTCA 23220 

TAAGCTCTTT ACTAAAGATA AGGACTATAT CATTCGTGGA AATGAGATGG TACTGGTTGA 23280 

TAAGGGAACA GGGCGTCTAA TGGAAATGAC TAAACTTCAA GGAGGTCTCC ATCAGGCTAT 23340 

TGAAGCCAAG GAACATGTCA AATTATCTCC TGAGACGCGG GCTATGGCCT CGATCACCTA 23400 

TCAGAGTCTT TTTAAGATGT TTAATAAGAT ATCTGGTATG ACAGGGACAG GTAAGGTCGC 23460

GGAAAAAGAG TTTATTGAAA CTTACAATAT GTCTGTAGTA CGCATTCCAA CCAATCGTCC 23520 

GAGACAACGG ATTGACTATC CAGATAATCT ATATATCACT TTACCTGAAA AAGTGTATGC 23580 

ATCCTTGGAG TACATCAAGC AATACCATGC TAAGGGAAAT CCTTTACTCG TTTTTGTAGG 23640

CTCAGTTGAA ATGTCTCAAC TCTATTCGTC TCTCTTGTTT CGTGAAGGGA TTGCCCATAA 23700 

TGTCCTAAAT GCTAATAATG CGGCGCGTGA GGCTCAGATT ATCTCCGAGT CAGGTCAGAT 23760 

GGGGGCTGTG ACAGTGGCTA CCTCTATGGC AGGACGTGGT ACGGATATCA AGCTTGGTAA 23820 

AGGAGTCGCA GAGCTTGGGG GCTTGATTGT TATTGGGACT GAGCGGATGG AAAGTCAGCG 23880 

GATCGACCTA CAAATTCGTG GCCGTTCTGG TCGTCAGGGA GATCCTGGTA TGAGTAAATT 23940 

TTTTGTATCC TTAGAGGATG ATGTTATCAA GAAATTTGGT CCATCTTGGG TGCATAAAAA 24000 

GTACAAAGAC TATCAGGTTC AAGATATGAC TCAACCGGAA GTATTGAAAG GTCGTAAATA 24060 

CCGGAAACTA GTGGAAAAGG CTCAGCATGC CAGTGATAGT GCTGGACGTT CAGCACGTCG 24120 

TCAGACTCTG GAGTATGCTG AAAGTATGAA TATACAACGG GATATAGTCT ATAAAGAGAG 24180 

AAATCGTCTA ATAGATGGTT CTCGTGACTT AGAGGATGTT GTTGTGGATA TCATTGAGAG 24240 

ATATACAGAA GAGGTAGCGG CTGATCACTA TGCTAGTCGT GAATTATTGT TTCACTTTAT 24300 

TGTGACCAAT ATTAGTTTTC ATGTTAAAGA GGTTCCAGAT TATATAGATG TAACTGACAA 24360 

AACTGCAGTT CGTAGCTTTA TGAAGCAGGT GATTGATAAA GAACTTTCTG AAAAGAAAGA 24420 

ATTACTTAAT CAACATGACT TATATGAACA GTTTTTACGA CTTTCACTGC TTAAAGCCAT 24480 

TGATGACAAC TGGGTAGAGC AGGTAGACTA TCTACAACAG CTATCCATGG CTATCGGTGG 24540 

TCAATCTGCT AGTCAGAAAA ATCCAATCGT AGAGTACTAT CAAGAAGCCT ACGCGGGCTT 24600 

TGAAGCTATG AAAGAACAGA TTCATGCGGA TATGGTGCGT AATCTCCTGA TGGGGCTGGT 24660 

TGAGGTCACT CCAAAAGGTG AAATCGTGAC TCATTTTCCA TAAAAGGAGA AAATATGACA 24720 

ATTTACAATA TAAATTTAGG AATTGGTTGG GCTAGTAGCG GTGTTGAATA CGCTCAAGCC 24780

TATCGTGCTG GTGTTTTTCG GAAATTAAAT CTGTCCTCTA AGTTTATCTT TACAGATATG 24840

ATTTTAGCCG ATAATATTCA GCACTTAACA GCCAATATTG GTTTTGATGA TAATCAGGTT 24900 

ATCTGGCTTT ATAATCATTT CACAGATATC AAAATTGCAC CTACTAGCGT GACAGTGGAT 24960 

GATGTCTTGG CTTACTTTGG TGGTGAAGAA AGTCACAGAG AAAAAAATGG CAAGGTTTTA 25020 

CGTGTATTCT TTTTTGACCA AGATAAGTTT GTAACCTGTT ATTTGGTTGA TGAGAACAAG 25080 

GACTTGGTTC AACATGCCGA GTATGTTTTT AAGGGAAACC TGATTCGGAA GGATTACTTT 25140

TCTTATACGC GTTATTGTAG CGAGTATTTT GCTCCCAAGG ACAATGTTGC AGTCTTATAC 25200 

CAACGAACTT TTTATAATGA AGACGGGACT CCAGTCTATG ATATCTTGAT GAATCAAGGG 25260 

AAGGAAGAAG TTTATCATTT CAAGGATAAG ATTTTCTATG GAAAGCAAGC TTTTGTGCGT 25320 

GCCTTTATGA AATCTTTGAA TTTGAATAAG TCTGATTTGG TCATTCTCGA TAGGGAGACA 25380 

GGTATTGGAC AGGTTGTGTT TGAGGAAGCA CAGACAGCAC ATCTAGCGGT AGTTGTTCAT 25440 

GCGGAGCATT ATAGTGAAAA TGCTACAAAT GAGGACTATA TCCTTTGGAA TAACTATTAT 25500 

GACTATCAGT TTACCAATGC AGATAAGGTT GACTTCTTTA TCGTGTCTAC TGATAGACAA 25560 

AATGAAGTTC TACAAGAGCA ATTTGCCAAA TATACTCAGC ATCAGCCAAA GATTGTTACC 25620 

ATTCCTGTAG GCAGTATTGA TTCCTTGACA GATTCAAGTC AAGGGCGCAA ACCATTTTCA 25680 

TTGATTACGG CTTCACGTCT TGCCAAAGAA AAGCACATTG ATTGGCTTGT GAAAGCTGTG 25740 

ATTGAAGCTC ATAAGGAGTT ACCGGAACTA ACCTTTGATA TCTATGGTAG TGGTGGAGAA 25800 

GATTCTCTGC TTAGAGAAAT TATTGCAAAT CATCAGGCAG AGGACTATAT CCAACTCAAG 25860 

GGGCATGCGG AACTTTCGCA GATTTATAGC CAGTATGAGG TCTACTTAAC GGCTTCTACC 25920 

AGCGAAGGAT TTGGTCTGAC CTTGATGGAA GCTATTGGTT CAGGTCTACC TCTAATTGGT 25980 

TTTGATGTGC CTTATGGTAA TCAGACCTTT ATAGAGGATG GGCAAAATGG TTATTTGATT 26040 

CCAAGTTCAT CTGACCATGT AGAAGACCAA ATCAAGCAAG CTTATGCCGC TAAGATTTGT 26100 

CAATTGTATC AAGAAAATCG TTTGGAAGCT ATGCGTGCCT ATTCTTACCA AATTGCAGAA 26160 

GGCTTCTTGA CCAAAGAAAT TTTAGAAAAG TGGAAGAAAA CAGTAGAGGA GGTGCTCCAT 26220 

GATTGAACTT TATGATAGTT ACAGTCAAGA AAGTCGAGAT TTACATGAAA GTCTAGGCGC 26280 

TACTGGTCTT TCTCAACTTG GAGTGGTCAT CGATGCAGAT GGTTTTCTGC CTGATGGTCT 26340 

GCTTTCTCCT TTTACCTATT ATCTAGGTTA CGAGGATGGA AAACCTCTCT ATTTTAATCA 26400 

AGTTCCCGTT TCAGATTTTT GGGAAATTTT AGGAGATAAT CAGTCTGCTT GTATTGAAGA 26460 

TGTGACGCAG GAGAGGGCTG TCATTCATTA TGCTGATGGA ATGCAGGCTG GCTTGGTTAA 26520

ACAGGTAGAC TGGAAAGACC TAGAAGGTCG AGTACGTCAG GTTGACCACT ACAATCGCTT 26580

CGGAGCTTGT TTTGCTACAA CGACTTATAG CGCAGATAGC GAGCCGATTA TGACAGTTTA 26640 

CCAAGATGTC AATGGTCAAC AAGTTTTACT GGAAAACCAT GTGACGGGTG ATATCTTATT 26700 

GACTTTGCCA GGTCAGTCCA TGCGTTACTT TGCAAATAAA GTTGAATTTA TCACCTTCTT 26760 

TTTGCAAGAT TTGGAAATAG ATACCAGTCA GCTTATCTTT AATACTCTAG CGACTCCTTT 26820 

CTTGGTTTCC TTCCATCATC CAGATAAATC TGGCTCGGAT GTCTTGGTAT GGCAGGAACC 26880 

TCTCTATGAT GCCATTCCAG GTAATATGCA GTTGATTTTG GAAAGTGATA ATGTGCGTAC 26940 

TAAGAAGATC ATCATTCCAA ATAAGGCGAC TTATGAGCGC GCTTTAGAGT TAACTGACGA 27000 

GAAATACCAT GATCAGTTTG TGCACTTGGG TTATCATTAC CAGTTCAAAC GTGATAATTT 27060 

CCTAAGACGA GATGCCTTAA TCTTGACCAA TTCAGATCAG ATTGAGCAAG TAGAAGCAAT 27120 

CGCAGGAGCC TTGCCTGATG TCACTTTCCG TATTGCAGCG GTGACAGAGA TGTCTTCTAA 27180 

GCTCTTAGAC ATGCTTTGCT ATCCTAATGT GGCCCTTTAC CAGAACGCTA GTCCACAGAA 27240 

GATTCAGGAG CTGTATCAAC TGTCGGATAT TTACTTGGAT ATAAACCACA GTAATGAGTT 27300 

GCTACAGGCA GTGCGTCAGG CCTTTGAGCA CAATCTCTTG ATTCTTGGCT TTAATCAGAC 27360 

GGTGCACAAT AGACTTTATA TCGCTCCAGA CCATCTATTT GAAAGTAGTG AAGTTGCTGC 27420 

TTTGGTTGAG ACCATTAAAT TGGCCCTTTC AGATGTTGAT CAAATGCGTC AGGCACTTGG 27480 

CAAACAAGGC CAACATGCAA ATTATGTTGA CTTGGTGAGA TATCAGGAAA CCATGCAAAC 27540 

TGTTTTAGGA GGCTAACATG TCAGAGGAAG ATTTATTTTA CAAAGACGTT GAAGGCCGCA 27600 

TGGAAGAGTT GAAACAAAAA CCCATCAAGA AGGAAAAAGA AACCCGAGGG GAAAAGATTA 27660 

GTAAGACTTT TTCACTTTTA CTGGGTTTGA TGATTCTGAT TGGTTTGCTC TTTACTTTGC 27720 

TGGGAATTTT GAGGTAGATC TATGATTGAA ATACTAATTG TTTTAGCTAT TATCCTATCT 27780 

CTTGCTTTGA TTGTATTGGT AACTATACAA CCCCGTCAAA ATCAACTATT TTCCATGGAT 27840 

GCCACTAGTA ATATTGGTAA ACCAAGCTAC TGGCAGAGCA ACACCTTGGT CAAGGTGCTC 27900 

ACTTTATTGG TGAGTTTGGC TTTATTTATT CTACTATTAA CCTTTATGGT GATTACTTAT 27960 

AAATAAAAGA AAACTTCAGA TATTCACCTT TTGTGGATTG GTCTGAAGTT TTCTTTTTTA 28020 

TACTCAATGA AAATCAAAGA GCAAACTAGG AAGCTAGCCG CAGGCTgCTC AAAACACCGT 28080 

TTTGAGGTTG TAGATATAAC TGACGAAGTC AGCTCAAAAC ACCGTTTTGA GGTTGTAGAT 28140 

ATAACTGACG AAGTCAGCTC AAAACACCGT TTTGAGGTTG TGGATAGAAC TGACGAAGTC 28200 

AGCTCAAAAC ACCGTTTTGA GGTTGTGGAT AGAACTGACG AAGTCAGCTC AAAACACCGT 28260 

TTTGAGGTTG TGGATAGAAC TGACGAAGTC AGCTCAAAAC ACCGTTTTGA GGTTGTGGAT 28320 



AGAACTGACG AAGctCAGTA ACATATATAC AGCAAGGCGA CGCTGACGTG GTTTGAAGAG 28380 

TATTACTGTC TATATTTTTG GTAAAAATCA ACTTTTACTT GGATGAAGGT TTTGGCTTCA 28440 

CGTAGGAGTT GAAGAAGGGT GGCGCGGGTT TCAAATTCTT CTCTTGTCTT GGGCAGACTG 28500 

CGGTTCCGGA AGACTTCCAG ATAACGTTCA ATTTCATGTA GCAAATCAGA AGCAGGATTG 28560 

GTCTGGCTCA GTTGACCTGC AATTTTTGAA AAGAGTTGCG CTAAGATCAG GCTTTCACTG 28620 

GCGGCAAGGT GACAAGTGTT AATCTGTTGG GCCATGTTTC TCAGGATACG ACTTTGTCGC 28680 

TGTCTCATCT CAAAGTAGTG GATATGGTAG TCTGTCTGGT GAAAGAGGTG GTCAGAGTGA 28740 

TCCAAATAGA CCAGTCTGAG GGCTTCTTTC AAAAGCGTGT CTAATTCTGC TACCAGCTGT 28800 

GCTCGGTTGC GTCCGTCTCC TCTGGATAAA TAGTATTTGA AGCGCTGGAG GATATCTTTT 28860 

AACTTTTCTT CCACCAGCGT GTGGTAGTGC TGGATTTCCT CTTCTCGTGA AGGCATATAG 28920 

AGATTAACAA GCAAGGCAAA TCCTGTACCA ATAGCAAAGA GAAGGAATTC ATTGACTAGA 28980 

AGGTCTGGAG AGGTTGACTC TTGAACCAAG AGATGGCTAA CCAAAACAGT GCTTGGTGTG 29040 

ATGCCAATTT CCCAGCCCAT CTTGTAGGCT AAAGGAACGT AGAAGGCCAG ATAGAGGCCG 29100 

AGACTCCAGA TATGAAATCC GCTCAAGTGA AAAGCTAGAA CACCGATAGC CAGAGCTAGA 29160 

AGCATAGAAA AAAGACGATT GCGAGCCAGT TTTAAAGTAC TTCTACGCGT ATCAGATAGG 29220

CTCAAGAGAG CGATAATTCC AGCCGAAACT GCTGACGAAA GATTGAGAAA ATAAGCAAGC 29280 

AGGCAGGCAA GACAGGTAGC TAAGATGAGC TTGGTCGTAC GTTGGCTAAT AGACATAAGA 29340 

ATTTCCTAAT AAGTTAGAAT AAAAGCGTAA AAGACAAGAC ATGAGCAGGC TTGCCTTGAT 29400 

GAGTTATTTT TTACGGGTTG CTGCGTATTC GGCAACGGCG GTAAAGAGGA CATCTGTAGA 29460 

AGAGTTAAGG GCTGTTTCAC ATGAGTCTTG GATGACACCA ATCACAAAAC CAACCCCAAC 29520 

AATTTGTATG GCAATATCGT TAGAAATACC GAAAAGGCTA CAAGCAACTG GGATAAGAAG 29580 

GAGGGAACCT CCGGCAATAC CTGAAGCATC ACAGGATGAG ATAGCTGCTA CCACACTGAG 29640 

GACAAAGGCT GTGGCAAAGT CAACAGGAAT TCCAAGAGTG TTAACTGCAG CAAGGGTCAA 29700 

AAGGTTAATG GTAATCGCTA CTCCAGCCAT ATTGATAGTA GAACCGAGTG GGATAGAAAC 29760 

AGAATAGGTA TCTGGGTTGA GTCCAAGGTC ATGGCAGAGT TTCATGTTGA CAGGAATGTT 29820 

AGTCGCAGAA CTACGAGTGA AAAAGGCTGT CACACCGCTG ACACGGAGGC AGTTCCAAAC 29880 

TAGAGGGTAA GGATTGCGTC TCATAAAGAA GAAGGCAATC AAAGGGTTGA CCACAGGGGC 29940 

AACAAAAAGC ATAGTCGTTA CTAATAGAAC CAATAAAATA CCGTAGTTGG CAAGGCTTCC 30000 

GACTCCCTTG TCAGAAATGG TTTTAAAAAC AAGACCAAGG ATTCCAAATG GAGCCAGATT 30060

GATGATCCAT TCGACAATTT TAGAAGTCAC GTCAGCGATA GTTTTTAGCA ATTCTTGACT 30120

ATTTTTACTG GCTTCTCTCA TAGCGATTCC AAAAATGACT GCCCAAGATA AGATTCTAAT 30180 

ATAGTTAGCA GTAAGCAGGG CGTTGACTGG GTTGTCAACC AGTTTGAGCA AGAGGTTGCT 30240 

GAGAACCTGC CCAATCCCAT CTGGTGGTGC AATTTCAGTA TTGGCACTAT TTGGGGTAAT 30300 

TTCAATAGGG ACGATGAAAT TTGCTAGTAC AGCTACAAGA GCAGCGGCGA AAGTCCCTAT 30360 

CATAGGATAT ACAAGAAAAC AACAGTTTTC ATATTGCTAT CTTGTCCCTT TTGATGTTGG 30420

GAAAGGGCAT TGGCAACGAG AGCAAAGACT AGGATAGGAG CAACAGCTTT TAGACCTCCA 30480 

ACGAATAAAT CCTCGAGTAG CCCAATCCCT GAGAGATTAG GAAGGGTCAG TCCTAGGATT 30540 

CCCCACAAAG CATACCAATC AAGATACGCT TGACAAGGCT TGCCTTATTC CAAGCATGAA 30600 

TGATTCTTTT CATAATAATC TCCTTTTTGT GTAGTGATTA TGATTATAGT ATAAATGATA 30660 

GACAAAATCA AGAATTTTCT GTCTATTTTT TGAATATTTA TGGAGAATGA GACTGATGAA 30720 

AATATGGTAT AATGAAATAA AGGAGTTTTA TATGCAAAAA TTTATTCAGG CTTATATTGA 30780 

AAAGCTAGAT GTGACAACCA TTATCGAGAA TATTCTAACC AAGGTCATTT CTCTTTTACT 30840 

GCTTTTAATT GTATTTTATA TTGCTAAAAA AATGCTTCAT ACCATGGTGC AGAGAATTGT 30900 

CAAACCTTCT CTAAAAATGT CTCGTCATGA TGTTGGACGC CAAAAAACCA TCTCACGTTT 30960 

ACTAGAAAAT GTGTTTAATT ATACGCTATA TTTCTTTTTA CTCTACTGCA TTTTGTCGAT 31020 

TTTAGGTTTG CCAGTTTCTA GTTTGCTGGC TGGAGCTGGT ATTGCTGGGG TAGCGATTGG 31080 

TATGGGAGCC CAAGGCTTTC TGTCTGATGT CATCAATGGC TTTTTCATCC TCTTTGAACG 31140 

TCAACTGGAT GTGGGAGATG AGGTCGTTCT GACAAATGGA CCGATTACTG TATCGGGTAA 31200 

GGTTGTCAGT GTGGGAATTC GTACGACACA GCTTCGTAGC GAGGAGCAAG CCCTTCACTT 31260 

TGTCCCTAAC CGAAATATCA CAGTTGTTAG CAATTTCTCA CGCACAGACT AGACCTGTTA 31320 

TTTTAAGTAA TTTGTGGTAC AATAGAGGGA GTTTAATAAG GAGAAAAGAT GGTTTTAGAA 31380 

AAGCAGTTGG GCAATGGTTG TACCTGGATA GACCTAGACC TAGGAAAGTT GAATAAACTA 31440 

GAAGACCTTT CTGAAATTTA CGGTTTGGAC AAGGAAACCA TTGAATACGC ACTGGATAGA 31500 

AACGAGCGCG CCCACATGGA CTACCACCGT GAAAGTGAGA CGGTTACCTT TATCTATAAT 31560 

GTCTTAGACG TAAAAAAGGA CAAGGCCTAC TATGAGACTT TTCCCATGAC CTTTATTGTC 31620 

GAGCATCGTC GCCTGATTAC CATTAGTAAT ACCAAGAACG CCTATGTCAT TGAACAGATG 31680 

ACTCGTTATC TGGAGAACCA TGACACGCTT TCGATTTATA AGTTTCTCTT TGCCAGTCTG 31740 

GAAATCATCA GCAATGCCTA CTATCCTGTC ATTGAGCAGA TGGACAAGAG TAGGGATGAG 31800 

GTCAATGACC TCTTGCGCCA GCGAACTACC AAGAAAAACC TCTTTGTCCT GTCTGATTTG 31860

GAGACTGGTA TGGTTTATCT GACGGCAGCT GCCAAACAAA ATCGGATTTT GTTAGAGCAT 31920

ATTCAAGGTC ATGCCTTGTA TCGTAGTTTT GATGAGATTG AGAGAGAACA GTTTGATGAT 31980 

GCCATGATTG AGGCTCATCA GCTGGTATCC ATGACAGACC TAATCTCTCA GATTTTACAG 32040 

CAGCTTTCAG CCTCTTACAA CAATATTCTA AACAATAATC TGAATGACAA TTTGACAACC 32100 

TTGACTATCA TTTCAGTCTT GCTAGCTGTT TTGGCAGTCG TGACAGGCTT TTTCGGAATG 32160 

AATGTTCCGT TACCTTTAAC AGATGAGCCC CATGCTTGGC TCTATATCAG TTTGGCTAGT 32220 

GCAGGTTTGT GGATTGTTTT ATCCTTGTTA CTAAGGAAAA TTGCGAAAAA AAGTTAAGAA 32280 

AAGGAGCCAG AATGGCGATT GAAAATTATA TACCAGATTT TGCTGTGGAA GCAGTCTATG 32340 

ATCTGACAGT CCCAAGCCTG CAGGCGCAGG GAATAAAGGC TGTTTTGGTC GATTTGGATA 32400 

ATACCCTCAT TGCTTGGAAC AACCCTGATG GAACGCCAGA GATGAAGCAA TGGCTACATG 32460 

ACCTTCGGGA CGCGGGTATT GGCATTATCG TAGTGTCAAA TAACACCAAA AAACGCGTTC 32520 

AACGAGCAGT TGAGAAATTT GGGATTGATT ACGTTTACTG GGCCTTGAAG CCCTTCACAT 32580 

TTGGTATTGA CCGTGCTATG AAGGAATTCC ACTATGACAA AAAGGAAGTG GTCATGGTTG 32640 

GTGACCAACT CATGACAGAT ATACGAGCAG CCCACCGTGC AGGGATTCGG TCAATTTTAG 32700 

TCAAACCCTT GGTCCAACAT GACTCAATCA AAACGCAGAT TAACCGAACT CGTGAGCGTC 32760 

GTGTTATG 32768 
(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14872 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

CCAGTCACAA AGAAATTGAG CGCGTTCAGc TGAGGATGCA CTATGATGCA AGCTACATTT 60 

CATTTGATGG GATATTAAGA AAGGAGATTT TCATGACACT TTTAGATGTA AAACACGTTC 120 

AAAAAATTTA TAAAACACGT TTTCAGGGCA ACCAAGTAGA AGCCCTCAAG GATATTCACT 180 

TTACCGTAGA AAAGGGTGAC TACGTTGCCA TCATGGGTGA GTCTGGTTCT GGTAAATCAA 240 

CTCTTCTCAA TATTCTAGCT ATGTTGGATA AACCAAGTCG TGGTCAGGTT TACTTGAATG 300 

GAACTGACAC CGCAACTATT AAAAATTCAC AGGCTTCTAG TTTCCGGCGT GAAAAGCTAG 360 

GATTTGTCTT CCAAGACTTT AACTTGCTAG ATACTCTGTC TGTTAAGGAC AATATCTTGC 420

TTCCGCTTGT CTTGTCAAGA AGACCTATAA CGGAGATGAT GAAGAAATTG GTGGTGACAG 480

CTGAGAATCT GGGTATTAAC CAATTGCAAG AGAAGTACCC TTACGAGATT TCTGGTGGTC 540

AGAAACAGCG TGTAGCAGTA GCCCGCGCCA TCATCACAGA ACCTGAAATT CTCCTTGCGG 600

ACGAGCCAAC AGGAGCCCTT GATTCCAAGT CATCTGCAGC CTTACTTGAT GTCTTTAATG 660

AAATCAATGA GCGTGGGCAA ACCATCCTCA TGGTAACCCA CTCAACAGCA 720

GGGCCAAGCG TGTTCTCTTT ATCAAAGACG GCATTCTTTA CAACCAAATC TACCGTGGAG 780

AGAAGACAGA GCGTCAGATG TTCCAAGAAA TCTCTGATAC CTTGACTGTC ATGGCAAGm 840

AGGTGAATTA GTATGTTTCG ATTAACCAAT AAGTTAGCGG TATCGAACPT . 1 1 nniwlL 900

CGCAAACTCT ACTATCCCTT TGCACTGGCT GTTCTCTTGG CAGTCAPCAT V_ lAl(,iL 960

TTTTACTCCC TAACCTTCAA TCCAAAGATT GCGGAAATCC GTGGAfiRAAP 1020

GCAACACTTG GATTTGGTAT GTTTGTCGTT ACCCTTGCGT PArfATTATP vj X IT— 1 A 1A_» 1080

CCAATAGTTT TGTCATGAAA AACCGTTCCA AGGAACTGGG a t\ xn i. n x ATGTTAGGCT 1140

TGGAGAAGCG CCATCTAATC AGTATGACCT TTAAGGAfiTT f\\J X UV3 1A1 11 oubA 1 1 LTAA 1200

CTGTTGGAGC GGGTATCGGT ATTGGAGCCT TGTTTGACAA w x xnni X x X v_ 1 lUV.l\at- 1260

TCAAACTAAT GAAACTGAAG GTTGAGCTGG TTGCTACCXT Cf*AA Affl h. IkT 1320

CAGTACTTGT TGTCTTTGGA TTGATTTTCC TAGGCCTCAT GTTCCTRAAT 1380

TCGCCCGTAT GAATGCCCTC CAGCTCTCGC GTGAGAAAGC AAGCGGAGAG 1440

GCTTCCTACC TCTCCAAACG ATTCTTGGTT CCATAAGTTT AGGGATTGGC TATTATCTTG 1500

CCCTTACGGT AACCGATCCT CTTACAGCCC TAACAACTTT CTTCCTAGCT GTTTTGCTGG 1560

TTATCTTTGG TACTTATCTA TTGTTTAATG CAGGGATTAC AGTCTTCCTA CAAATCTTAA 1620

AGAAAAACAA GAAATACTAT TACCAACCTA ATAACCTCAT ATCTGTTTCC AACTTGATTT 1680

TCCGTATGAA GAAAAATGCG GTTGGACTAG CAACCATCGC TATTTTGTCA ACAATGGTTT 1740

TGGTAACCAT GTCAGCAGCG ACAAGCATTT TCAATTCCGC AGAAAGCTTT AAAAAAGTTC 1800

TAAATCCTCA TGATTTTGGG GTTTCAGGGC AAAATGTTGA AAAAGAAGAT TTGGACAAAC 1860

TCTTGAGCCA GTTTGCAAGT GACAAAGGTT ATAGTGTCAA AGAGAAAGAA GTACTTCGTT 1920

ACAGTAACTT TGGTATTGCA AATCAAGAAG GAACCAAGTT AACTATTTTT GAAAAAGGAC 1980

AAAACCGTGT CCAACCCACA ACAGTTTTCA TGGTATTTGA CCAAAAAGAT TATGAAAATA 2040

TGACTGGTCA AAAACTGTCT CTATCAGGAA ATGAGGTCGG TCTCTTTGCC AAAAATGACG 2100

GACTGAAAGG ACAGAAAGCT CTAACTCTAA ATGATCATCA ATTTTCTGTC AAAGAAGAAT 2160

TTAATAAAGA TTTCATTGTG AACCATGTTC CAAATAAGTT TAATATCTTG ACTACTGATT 2220



ACAATTACCT TGTTGTTCCT GATTTACAAG CCTTTTTGGA TCAATTCCCA GATTCGGCTA 2280

TCTATAATCA GTTTTACGGT GGTATGAATG TAAATGTCAG TGAAGAAGAA CAACTCAAGG 2340

TCGCTGAGGA GTATGAAAAC TACCTCAATC AATTTAATGC TCAATTAGAC ACAGAAGGTA 2400

GCTATGTTTA TGGTAGCAAT CTAGCAGATG CTAGTTCTCA GATGAGTGCC CTCTTTGGTG 2460

GTGTCTTCTT TATCGGTATT TTCCTATCCA TTATCTTTAT GGTCGGAAGT GTTCTGGTCA 2520

TCTACTACAA ACAAATTTCT GAAGGCTACG AAGACCGTGA ACGCTTTATT ATCTTGCAGA 2580

AAGTCGGTTT GGACCAAAAG CAAATCAAGC AAACCATCAA CAAACAGGTT TTAACTGTTT 2640

TCTTCCTTCC TTTGCTCTTT GCCTTCATAC ATCTCGCCTT TGCCTACCAT ATGCTTAGGC 2700

TGATTTTAAA AGTGATTGGT GTACTGGATA CGACTATGAT GTTGATTGTG ACCTTGTCTA 2760

TCTGCGCTAT CTTCCTCATC GCCTATGTGC TGATTTTCAT GATTACTTCA AGAAGTTATC 2820

GCAAGATTGT GCAAATGTAA AAAAGATACC TCGACTTCAA AATCGAGGTA TTTCTTGTAT 2880

TCTAAATGCT GAAAAGTTGT CCGAGCAGGA AGGTAACTCC CATGGTCAAG AGACCAATAG 2940

CAAGGTTCCG AATCATAGCT GTTTTGGTTG GGGCTTTTCC AAGTCTAGCA CTTGTGTAAC 3000

CAGTGAGAAG AAGGGCCACA CCGACAATAA GGACGGTAGC AGGGATGCGG TAATCACTTG 3060

GAAAAATGGT CACTGACAGC ATTGGAGGCA AACTTCTAAG GAAAAAGGCA ACGAAGCTAG 3120

AAATGGCAGC GTGCCAAGGA TTGGTAAATT CTTCATACTC AATCCCATAT TTTTCCTCTA 3180

CCAGAGCCTT GAGTGGATTT TTAAGAAAGA TCTTATTGGT CAAGAGTTGG GCAGAAGTTT 3240

TGAATTCTCC ATTTTGGATA TAAGCAGCAT AGAGGGATTT TTTGGCTAGT TCCCTATCTT 3300

GGTCTAGCAA GAGTTTTTCT CGCGAAACGG CAGCTTCCTC GGTATCTTTT GGAGTTGAAA 3360

CGGATACATA TTCTCCACCA GCCATTGAAA AGGCACCAGC TAAGATAGCC GTAAAACCTG 3420

ATAAAAAGAT AATCCAGATA TTGGTCGTGG CACTGGCAAC TCCGATAACC ACACCAGCAA 3480

TGGAAATAAT TCCATCGTTA GCATCAAGAA CACCCGCACG CAGGATATTT AAACGACCTG 3540

CAAAATTTGA ATCAATTTCG TGATTTGTTT CTGACGCTAA ATTTCAAGTT CAAGTTAGCC 3600

ATCAAGAAGT CTTCTCTGGG TGACTTGTAG TCCAAGCATT TTTTAGGATA GTTGTTAATC 3660

CACTTTTCGA TGAATGCGAC TTCTTTGGGA GTCATTTTCT TGGTTCCCTT AGGTAACCAT 3720

CTACGAATGA GCCTGTTGTG ATTCTCATTA GTTCCCCTTT CCCAAGAGGC ATAGGGATGT 3780

GCATAATAAA TGTGCTCCTC AGAAAATACA TTAGACAAGC GATTGAATTC CGTTCCATTA 3840

TCTGCCGTGA TGGAAAGAAT CTTGTGTTGT TTTAAGATGA GTTTTAGAGC CTGATTGACC 3900

ACATCAGCAC TTTTATTTGG AATCAATCGG ATGATCTGAT GTCTACTTTT TCGATCGGTC 3960

AAGACAAGCA AGCAGTAGTT TTTCGCTCTC GTAAGTAGAA CTGTATCAAT CTCATAATGC 4020

CCATTCTCCA AGCGAAAATT GATAGCTTCA AGCCGCTGTT CGATGGATTG ACCAGCAGGT 4080 

TTAAAGTTGG TGCTGGCCTG TTTCTTAAGC GCTTTTCCTT TTCTAGGGTA AAGCAGATCC 4140 

TGTTTGCTTA ACCCCAATTT TCCATGATGA ATCCAATAGT AAATGGTTGA AATTCCCACG 4200 

TTAACCCCTT TAGCCATCAC CATCATTTCA GGCGAAAATT TTTGGTTATG ATAGTGGAGA 4260

ATCTTTTCCT TTAGTTCCTT GGTCAAGCTT GATTTCTTGA CCGAGCGCTT GCGATTGTTT 4320 

TCATAAGACT GTTGAGCATA GTCGGCAGAA TAAACCTCTT TGAAGCGCCC TTTTCCAAGA 4380

CATTGTCGGA CTGTCCCACG CTTGATTTCA GTGTGGATAG TTTGAGGAAC TTTTCCAAGC 4440 

AGAGAGGCAA TTTCTCTATT TGATTTCCCT TCTTTTTTCC ATCTTTCGAT TAAGCGACGG 4500 

CTATCGATTG TCAAATGTTC GCCTTTTGTA GTATAATGGT TTTGCATCTC TGTGCCTTTC 4560 

TTGTGTTTGT GGTTGAACAA CAAGTATAAC ACAGAGGTGT TTTCTTATGC CTACAAGAGC 4620 

TATCGGCTAG TTGAACCATC TAATTTTTAG GAGGGCTGGG TGGCTAACTT CATTATAGAA 4680 

CTTTCATTTA CGAACATATA GTAAAATGAA ACAAGAACAG AACAAATCGA TCAGGACAGT 4740 

AAAATCTATT TCTAACAATG TTTTAGAAGC AGAGGTGTAC TATTCTAGTT TCAATCTATT 4800 

ATATTTTTGT TTTTTATCAA AAAATACTTT ACAAGTTCTT AAAAACATGA TATAGTAATA 4860 

AAGCTTAGAA AATGAGATGA TGTTTTCTAG CAAATATAAA CCCGAGTAAA AAATGCCTAC 4920 

GGACAGGCAG GGTTGAATGC CGAAGCGTGG TTGAAAAGCC ACATTATTGA TAGGGTTAAA 4980 

AGCCTACTTT TATAAGTTGA TGTTAGGACA CTTGTCCTAA TTCATAAATT TTTAGTGTGG 5040 

TGAAAGCACA CGTCATCTTG TGAAACGATC AATAAAGTAC GTAATATTTG CTACTAGAGA 5100 

GTTAGGAAAC ATCGGGAACA GACATACTCA ACAGAAACCA AAATAAACAC GTCAGAAGAT 5160 

TGCAGAGCAG GTGAAAACCT GCTCTTTTTT CATGAGTCAA CCTTTAGTTC CTTAGTTTTC 5220 

ATAAGGTCCT AAAAATATTG AAAGGAGTAT GTTTTGAAAG AGTTAGATCA AAACCAAGCC 5280 

CCAATTTATG AGGCCTTGGT GAAGTTACGC AAGAAAAGGA TTGTTCCCTT TGATGTTCCA 5340 

GGTCACAAGC GTGGACGGGG AAATCCAGAA CTTGTCGAAC TCTTAGGAGA AAAATGTGTA 5400 

GGCATTGATG TCAATTCGAT GAAACCTTTG GATAATTTAG GCCATCCTAT TTCGATTATT 5460 

CGTGATGCAG AGGAGCTGGC TGCAGATGCT TTTGGAGCTA GCCATGCCTT TCTAATGATT 5520 

GGTGGAACAA CTTCATCGGT GCAGACTATG ATTCTGGCAA CCTGCAAGGC AGGAGATAAG 5580

ATTATTCTGC CACGAAATGT CCATAAATCT GCTATCAATG CGTTGGTTCT ATGTGGTGCC 5640 

ATTCCCATCT ATATCGAGAT GAGTGTAGAT CCTAAGATTG GTATCGCTTT AGGTCTTGAA 5700 

AATGACCGAG TAGCACAGGC CATAAAGGAC CATCCAGATG CTAAGGCTAT CCTAATCAAC 5760

AATCCTACTT ACTACGGCAT CTGTTCAGAC CTAAAGGGGT TGACAGAAAT GGCTCATGAA 5820

GCTGGCATGA TGGTTTTAGT AGATGAAGCC CACGGAGCGC ATTTGCATTT CACTGATAAA 5880

CTTCCAATTT CTGCTATGGA TGCAGGGGCT GATATGGCAG CAGTTTCCAT GCATAAGTCT 5940

GGTGGGAGTT TGACCCAAAG CTCCATTTTA CTTATCGGGG AGCAGATGAA TTCTGAATAC 6000

GTTCGTCAGA TAATTAACCT GACCCAGTCT ACATCTGCCT CTTACTTGTT GATGGCTAGT 6060

TTGGATATTT CACGTCGCAA CTTGGCCCTT CGTGGTAAAG AGTCGTTTGA GAAAGTCATT 6120

GAGCTATCTG AGTATGCCCG CCGTGAAATC AATGCTATCG GTGGCTACTA TGCCTACTCA 6180

AAAGAGTTAA TAGACGGTGT TTCGGTTTGC GATTTTGACG TAACTAAGCT GTCAGTTTAC 6240

ACTCAGGGTA TTGGCTTAAC AGGTATCGAG GTTTATGACC TCTTGCGAGA CGAATACGAC 6300

ATTCAGATCG AGTTTGGTGA TATCGGCAAT ATCTTGGCCT ATATTTCCAT CGGCGACCGC 6360

ATCCAAGACA TCGAGCGCTT GGTTGGTGCT CTGGCTGATA TTAAGAGACT CTATTCAAGA 6420

GATGGAAAAG ATTTGATAGC AGGAGAATAT ATTCAGCCCG AGTTAGTGCT GTCTCCGCAA 6480

GAAGCCTTCT ATTCAGAAAG AAAAAGTTTA ACTTTGGATG ATTCTGTTGG ACAGGTCTGT 6540

GGAGAATTTG TTATGTGTTA CCCTCCAGGT ATTCCTATCT TGGCTCCTGG TGAACGCATT 6600

ACACGAGAAA TTGTCGACTA TATCCAATTC GCCAAGGAAC GTGGTTGCTC CCTCCAAGGG 6660

ACGGAAGATC CAGAGGTCAA TCATATCAAC GTTATTAAGA GAAAGACAAA CTATAAGAAA 6720

AGTCAATAGT TTTATCTAAA CTATTTCTTA TTTCAATTTG ATGATTTGGC GATGATTTTA 6780

GAGCACGGCA AAAAGCCCTT GAATTAGAAG CGGTCAATCG CTTAATTTCT ATCAGCTTAT 6840

CAAATCCTGC CTCAAGCCTT TTCTGAGGAT TAGGGTAGCG TGTCAAGAGT TGGTAGGTAT 6900

ATTCTGAATG CTTTCCAACG ATTTTATCCA ACTCAGGAAA GATGATATCA AGACAACGAG 6960

TGTATTGTAC TTTCCAATCA GACTGTTTTT TCTTGAGACG ATGAATATGT CTAGCCAGTA 7020

TTTTTAGTTC TACTTGCCGA TTATCGTGTT GAAATTGTTC ACGATTGGGG TCAGAAAGAA 7080

GTTTAAGAGC GATGCCATGA GCGTCTTTCT TATCCGTTTT AGTTTTGCGA AGTGATAATG 7140

ATTTGGCAAA TTTCTTGATG AGCAAAGGAT TGTAGGTGTA AACTTTATAT CCTTGTTCAT 7200

GCAGGAAGTT CAGTAGATTA AAGGCATAAT GTCCGGTATT TTCAAGAGCG ATGAGACAGT 7260

CTTGGTTGAG CTGTCGAAGA GACAGATCTA AGAGTTCAAA ACCAGCTTTA TTATTTGAAA 7320

AAGTGAGTGG TTTAAGAACA GTTTTTCCTG GAACATTCAA GGCTGTAACA TCGTGTTTAT 7380

TTTTAGCGAC ATCAATGCCC ACATAAAGCA TGGGAGTATC TCCAGATATA GTATTTCAAG 7440

TCTACTGGGT TATCCACGAA CTTTTTGCCT TGTTACCTTA GACGAGATAA AACGTCTATG 7500

CGTTATCAAA CTCATTACCA ATTGAAACAA AAAACTGTGG TTAGAGCCTT TCGGAAATCG 7560

TCAAGCGATT GGAGGAAATG AACTAATCCA CAGTGGCTTA TTCCAAGTAT ACCACTTGGG 7620

CTTTGGCAGT AGCTAACTGC GCTAAATATA ATATAAGGAG AAATAGATGG ATTTATGGTT 7680

TTCTGAAGTT CATACTCCAG ATGTCAAATT GTCTCTGAGA ACAGCCAAGC AACTTTACGC 7740

TGGAAAAAGT GAATGGCAGG ATATCGAAGT CTTGGATACG CCAGCTTTTG GGAAAATACT 7800

GATTTTAAAT GGCCATGTCT TGTTCTCAGA TGCGGATGAT TTCGTCTACA ATGAAATGAC 7860

CGTTCACGTT CCCATGGCTG TCCACCCAAA TCCAAAGAAA GTATTGGTTA TTGGGGGTGG 7920

TGACGGCGGT GTTGCCCAAG TATTAACCCT CTATCCTGAA CTGGAGCAAA TTGATATTGT 7980

GGAACCGGAT GAGATGTTGG TCGAGGTCTG TCGTGAGTAT TTCCCAGAGT TTGCTGCAGG 8040

GCTAGATGAT CCTCGTGTTA CCATTTACTA CCAAAATGGG CTACGCTTTT TGCGAAACTG 8100

CGAAGATGAT TACGATATTA TCATCAACGA TGCGACAGAT CCATTTGGCC ATACGGAAGG 8160

ACTCTTTACC AAGGAATTCT ACGGCAATAG TTATCGAGCT CTGAAGGAAG ACGGCATCAT 8220

GATTTACCAG CATGGGAGTC CCTTCTTTGA CGAGGATGAG TCGGCCTGCC GAAGCATGCA 8280

CCGCAAGGTC AATCAAGCCT TTCCAATCAG TCGGGTTTAT CAGGCCCATA TTCCAACTAG 8340

CCCAGCTGGC TATTGGTTGT TTGGATTTGC ATCGAAAAAA TACCACCCTG TCAAAGATTT 8400

TGACAAGGAA GGCTGGAAAA AACGCCAGCT TTTCACAGAA TACTACACTG CAAACTTACA 8460

CGTGGGAGCC TTTATGTTGC CCAAGTATGT TGAGGACATT TTAGAAGAAG AGGAAGGAAA 8520

AAAATGAGTC GTTTACTAGT TATTGGTTGT GGGGGCGTTG CCCAAGTTGC TATTTCAAAG 8580

ATTTGTCAAG ATAGCGAAAC ATTTACAGAG ATTATGATTG CTAGCCGTAC CAAGTCAAAA 8640

TGCGATGACT TGAAAGCGAA GCTAGAAGGC AAAACAAGTA CTAAAATTGA AACTGCAGCA 8700

CTTGATGCTG ACAAGGTTGA AGAAGTGATT GCCCTGATTG AAAGCTACAA ACCAGAAGCT 8760

GTTTTGAATG TAGCTCTGCC TTATCAAGAT TTAACCATTA TGGATGCTTG TTTGGCAACA 8820

GGTGTTCACT ATATCGATAC AGCCAACTAC GAAGCAGAAG ACACAGAAGA CCCTGAGTGG 8880

CGTGCTATCT ACGAAAAACG TTGTAAGGAA CTTGGTTTTA CAGCCTACTT TGACTACTCA 8940

TGGCAGTGGG CTTATCAAGA GAAATTCAAA GAAGCAGGCT TGACTGCTCT TCTTGGTTCT 9000

GGTTTTGACC CAGGTGTAAC TAGTGTCTTT TCAGCTTATG CCCTCAAACA CTATTTTGAT 9060

GAAATCCATT ATATCGACAT TTTAGACTGT AATGGCGGTG ACCACGGTTA TCCATTTGCA 9120

ACCAACTTTA ATCCAGAAAT TAATCTCCGT GAGGTTTCTG CGCCAGGTTC TTACTGGGAA 9180

GATGGGAAAT GGGTCGAAGT CGAAGCTATG TCTATCAAGC GTGAGTATGA TTTCCCTCAA 9240

GTTGGACAAA AAGATATGTA TCTCCTTCAC CATGAAGAAA TCGAATCATT GGCCAAGAAC 9300

ATTCCAGGTG TCAAACGCAT TCGTTTCTTT ATGACTTTTG GTCAATCTTA CTTGACGCAC 9360

ATGAAATGTC TTGAAAATGT TGGACTCCTT CGTACGGATA CCATTAACTT TAACGGCCAA 9420 

GAAATTGTTC CAATTCAATT TTTGAAAGCC TTGCTTCCAG ATCCTGCCAG TCTTGGGCCA 9480 

CGTACAGTCG GAAAAACCAA TATTGGATGT ATCTTTACAG GTGTCAAAGA CGGTGTCAAA 9540 

AAGACTATCT ATATCTACAA TGTCTGCGAC CATCAGGAAT GTTACGCAGA GGTTGGTTCG 9600 

CAAGCTATTT CTTATACGAC AGGAGTTCCA GCCATGATTG GGACAAAATT AGTCATGAAC 9660 

GGAACTTGGA AACAAGCTGG AGTGTATAAC CTTGAGGAGT TAGATCCAGA TCCATTCATG 9720 

GAAGCTTTGA ATGAGTATGG TTTGCCATGG GTTGTGGTTG AAAATCCACA AATGGTGGAC 9780 

TAATGAAGTT AGAACAAGTA CCAACACCAG CCTATGTTAT TGACTTGGCC AAGTTAGAAG 9840 

CTAATTGCCG CATTCTACAA TATGTACAAG AAGAGGCCGG TTGCAAGGTC TTGCTTGCCC 9900 

AGAAGGCATA TTCCCTCTAC AAAACTTATC CCTTGATTAG CCAGTATCTA TCAGGTACGA 9960 

CAGCTAGTGG ACTCTATGAG GCCAAATTGG CAAGGGAAGA ATTTCCTGGT GAAGTCCATG 10020 

TATTTGCGCC TGCTTTCAAG GATGCAGACT TGGAGGAATT GCTAGAGATA ATGGACCATA 10080 

TAGTCTTTAA CTCAGAGAGA CAGTTGCGTA AACACGGTCC GCGTTGTCGA GAGGCTGGTG 10140 

TCAGTGTTGG TTTGCGCCTC AACCCTCAGT GTTCAACTCA AGGcAGATCA CGCGCTCTAT 10200 

GACCCTTGTG CACCAGGTTC TCGCTTTGGA GTTACTATAG ACAAGATTCC GAGTGATTTG 10260 

CTAGATTTGG TTGACGGACT TCATTTTCAT ACCCTTTGCG AGCAGGGAGC AGATGATTTA 10320 

CAAACAACTT TGAAAGCAGT AGAAGAACAG TTTGGTCCCT ACTTACATGA GGTAAAATGG 10380 

CTCAATATGG GTGGTGGTCA TCATATTACA AGAGAAGGTT ACGATGTGGA TTTGCTGATT 10440 

TCAGAAATCA AGCGTATCCG AAAAACTTAC AATCTTGAAA TCTATATCGA GCCTGGTGAA 10500 

GCCATTGCGC TTAATGCGGG TTATTTAGCA ACTGAGGTAT TAGATATTGT AGAAAACGGT 10560 

ATGGAAATCT TGGTTTTAGA CGCCTCTGCG ACCTGCCATA TGCCTGATGT ACTTGAGATG 10620 

CCCTATCGTC CACCTTTGAG AAATGGCTTT GAGTCACAGG AAAAAGCCCA TACCTACAGA 10680 

CTTTCTTCTA ATACCTGTCT GACGGGCGAT GTGATTGGTG ATTATAGTTT TGAAAATCCA 10740 

GTCCAAATCG GAGACAGACT TTATTTTCAA GACATGGCCA TTTATTCTTT TGTCAAAAAT 10800 

AATACCTTTA ATGGTATTGG ATTGCCAAGT CTCTATCTCA TGGACGAACA GGGAGACTGT 10860 

AGCTTACTCA AAGCTTTTGG CTATCAAGAC TTTAAAGGGA GATTATCATG ATGGACAGTG 10920 

CAAAAAAATT AGGCTATCAC ATGCCAGCAG AGTACGAACC GCATCATGGT ACCCTCATGA 10980 

TATGGCCGAC TCGACCAGGA TCATGGCCTT TTCAAGGAAA GGCTGCTAAA AGAGCATTTA 11040

CTCAGATTAT CGAGACCATA GCAGAAGGGG AAAGAGTCTA TCTTTTGGTG GAGCAGGCCT 11100

ATCTATCTGA AGCCCAATCC TATCTTGGAG ACAAGGTTGT TTATTTAGAC ATTCCCACCA 11160 

ATGATGCCTG GGCGCGTGAT ACTGGCCCAA CCATTCTCGT CAATGATAAA GGTAAGAAAT 11220 

TAGCCGTGGA TTGGGCCTTC AATGCTTGGG GAGGCACCTA TGATGGTCTT TATCAAGATT 11280 

ATGAAGAGGA TGACCAAGTA GCCAGTCGTT TTGCTGAGGC CTTGGAAAGG CCTGTCTATG 11340 

ATGCTAAACC TTTTGTACTG GAAGGAGGCG CAATCCATAG CGATGGTCAA GGAACTATTC 11400 

TCGTAACTGA AAGTTGCTTG CTTAGTCCTG GTCGCAATCC TAACTTGACT AAAGAGGAGA 11460 

TTGAAAACAC ATTATTAGAA AGTCTTGGTG CTGAAAAAGT TATTTGGCTT CCTTATGGTA 11520 

TTTATCAGGA TGAAACCAAT GAACACGTCG ATAATGTTGC TGCCTTTGTT GGTCCTGCTG 11580 

AGCTTGTTTT GGCTTGGACA GATGACGAAA ATGATCCCCA GTATGCCATG TCAAAAGCAG 11640 

ATCTCGAACT CTTAGAACAG GAAACAGATG CAAAAGGTTG TCACTTCACC ATTCATAAAT 11700 

TGCCTATCCC TGCAGTTCGA CAAGTTGTGA CAGAAGAAGA TTTGCCAGGC TACATCTATG 11760 

AAGAAGGAGA AGAAAAGCGA TACGCAGGTG AACGACTAGC AGCTTCCTAC GTAAACTTTT 11820 

ATATCGCCAA CAAGGCTGTC TTGGTTCCAC AGTTTGAGGA TGTAAACGAC CAAGTGGCCT 11880 

TAGATATCCT CAGCAAGTGT TTCCCAGACC GTAAAGTTGT CGGAATACCA GCCAGAGATA 11940 

TTCTCTTAGG TGGTGGCAAT ATCCACTGTA TCACCCAACA AATTCCAGAA TAGGAGAAAA 12000 

AGATGAGAAA TGTAAGAGTT GCAACCATTC AGATGCAATG CGCTAAGGAT GTGGCAACAA 12060 

ATATCCAAAC CGCAGAGCGT TTAGTACGTG AGGCTGCTGA GCAAGGAGCC CAAATTATTC 12120 

TCTTGCCCGA GTTGTTTGAA GATCCCTATT TCTGTCAGGA ACGTCAGTAT GACTACTACC 12180 

AGTATGCCCA ATCTGTAGCG GAAAATACTG CCATTCAGCA TTTTAAGGTG ATTGCTAAGG 12240 

AACTACAAGT TGTTTTACCA ATCAGTTTCT ATGAAAAAGA TGGTAATGTC TTGTATAACT 12300 

CTATTGCCGT CATTGATGCA GATGGGGAAG TGCTGGGCGT TTATCGAAAG ACCCATATAC 12360 

CAGATGACCA TTATTATCAA GAAAAATTCT ATTTCACGCC TGGTAACACT GGTTTCAAGG 12420 

TCTGGAATAC TCGCTATGCT AAGATTGGTA TCGGTATCTG TTGGGATCAA TGGTTCCCTG 12480 

AAACAGCGCG CTGTCTTGCA TTGAATGGTG CTGAATTGCT CTTTTATCCT ACAGCTATCG 12540 

GTTCAGAGCC AATTTTGGAT ACAGATAGTT GTGGTCACTG GCAACGTACT ATGCAAGGGC 12600 

ACGCAGCAGC GAATATTGTT CCAGTCATCG CAGCCAATCG TTATGGTTTA GAGGAGGTTA 12660

CTCCTAGTGA GGAAAATGGC GGACAGAGCT CCAGTCTTGA CTTCTACGGT TCCTCCTTTA 12720 

TGACGGATGA AACAGGAGCT ATTCTAGAAC GAGCTGAAAG ACAAGAAGAA GCTGTTCTGT 12780 

TAGCTACTTA TGACCTAGAC AAGGGAGCAA GTGAACGCCT AAACTGGGGC TTGTTTCGAG 12840 
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ATAGAAGACC 


AGAAATGTAT 


AGACAAATTA 


CAGATTAGTG 


TGGGAGAAAT 


GAGAGATTCA 


12900 


TTCTGCTAGA 


CTAACTTCTT 


ATTAGTAACT 


ATAAGATACT 


ATGGCATCTA 


GTAAATCGAT 


12960 


TTTTATGATT 


CGCTATTCTT 


GTCTATTGAT 


TAGTCCGTAT 


TTTAAAATAT 


TAGCAAAAAA 


13020 


GCAAATAGCA 


GTAACTTCTG 


TCTATTTGCT 


TTTCTTTTTT 


ATAGAATATA 


TTTCTCAATA 


13080 


GCACGCGCAA 


CGCCGTCTTC 


TTCGTTGCTT 


GAGGTAACGG 


CATCGGCAAG 


AGATTTGATA 


13140 


TAATCGCTGG 


CATTTCCCAT TGCAATCCCA AGCCCTGCAA ACTGGAGCAT 


TTGGATATCG 


13200 


TTATTAGCAT 


CGCCCATGGC 


CATAATCTCT 


GAGGAATCAA 


TCTTCAAAAT 


CTCAGCTAGT 


13260 


CGTGAAAGAG 


CAGTAGCGTT 


TGTCGTTCCA 


AGCGGCATTG 


CTTCATAAAT 


GACAGGCTGC 


13320 


GAACGAACTC 


CACTGAATCG 


TTGGCAAAGC 


TCTTCAGCAA 


AACGCTGGTC 


AAAATCGTCT 


13380 


GTTTGTTCTT 


TTGTTCCTAA 


ACACATACCT 


TGGAACATCC 


GGAACTTTCC 


ACTAGTCGCT 


13440 


TCTTCAAGAG 


AAATTTCAGT 


CAGGTCTGAA 


AATACTAGTT 


TAGCATCATT 


TTCAATAACT 


13500 


TGATTGGGCT 


TGTCACCGAG 


AACAAAATAA TGTGACTCGT 


CAAAAAGTGT 


CAACTGAACA 


13560 


TCACTCTTTT 


CAGCAAGGTC 


ATAGAGGTAT 


TCGATGTCAG 


CTGGACTCAG 


TTCTTTCCAG 


13620 


TCAACTAGAC 


TCCAATCACT 


GGTCTGGTGA 


GTTGAACAAC 


CGTTGTTAAC 


AATAATATAT 


13680 


TCGTTCTGGA 


GGTCAAGCTC 


CAGTTTTTTG 


TAGTAGGGGA 


GGACACCGAA 


AAGGGGGCGA 


13740 


CCCGTACAGA 


GAACCAGTTT 


GACACCTTTT 


TCAATGGCTT 


TGTGAATAGC 


AGTAATGTGT 


13800 


GCTTGTGGGA 


TTTCCTTGGC 


TTCATTGAGG 


AGGGTGCCGT 


CCATATCCAA 


GGCTAGTAGT 


13860 


TTAATCATAG 


GTCTTCCTCT 


TTATCTTTGC 


TATTATTATA 


GCATATTTTG 


GAGAAGAAAT 


13920 


TGATAGAAAG 


CTTGAGACTA 


ATTGATTTTA 


TAGTTTAAGA 


TGTTTTGATG 


ACAATTCATG 


13980 


ATTTGAAGAG 


GATATTTCGC 


AAAGATATGC 


TATACTATGT 


TTGTCAATGT 


TGCAACTAGA 


14040 


CAAATTAAAA 


AACCAACTTA ATATAATAGT TTTTTTGTAA 


GTAGGTATGA 


GTAGCAGATT 


14100 


ACTCAACTAA 


TCTGAAGAAT 


AATGGAGGAA 


ATATATCATG 


ATTTTAATGA 


CAAAAAATAT 


14160 


AAATCTAACA 


AATGAAGAAT 


TAGAGCTGAT 


ACAAGGTGGA GCAGATCCAT 


ATGGTAAAAA 


14220 


TCCTAATGGT 


AGGTACGATT 


GGGAAATAGA 


ACCAGTATTA 


ACTCTGCTGG 


TTCATGGATT 


14280 


TTGTCCCAGA 


GGCACCTATG 


ATTCAGGATA 


TATTGGAGGA 


GGTAATCATG 


TTTGCAAAGG 


14340 


AAGTGCTGCG 


AGATTTTAAG 


TAAAATTTAT 


TAGGAATATG 


AAGAAACAAG 


GGGAGAAAAC 


14400 


AGAGGATTTA 


ATATGAAAAA 


ACGAGCTATT 


CAAATTTTAC 


TAGCATTGTC 


CTTAATTTTT 


14460 


TACAAATCAA 


CTTGGTTTTG 


GAGGCTTTTC 


AATTATCTGG 


CAAAGCCCTA 


TCTACCAGCA 


14520 


AGTCGTGAAT 


TTTTTCAGAT 


TCTGCTTTTG 


ATGGAGAGCG 


GAGTTCTTTT 


CTTAGCGGTC 


14580 
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ATCTATCTAC TGGTTTTTGC AGGAAAGAAA ATTTTTCATT TCAAGTGGCA GCTGAGGTAC 
TTCATCTACC TTTTACTGGG CTACATCATT TCATATATGT CTGACTTCCT CTTTTCGTAT 
TTCATATCCC TGTCTTCAAA TCAGATTTCT TTGAATGAAA CGGTAGAAAT GATGGGGAGA 
CAGGAGTTCC CTTATGTCTT GCTCATCGTT TGCTTCATCG CCCCTATTGC TGAGGAATTG 
ATTTATCGAG GtGTGCTTAT GACAACCTGT TGCAAAAACT CACCTTGGTA CG 
(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10223 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



14640 
14700 
14760 
14820 
14872 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

CGTGCTATCG GTCTCAAAAC CAATCTGGTC GCTATGGTCA AATCCAGTTG GAAAATCCAT 60 

TCTTCTTGGA GCCATCTGCT GGATTGCCAT CATCCTCACC ACTCTTGGTA TGCAGACCCT 120 

TATCGGCATT TTCTAATACT CTTCGAAAAT CTCTTCAAAC CACGTCAACG TCGCCTTGCC 180 

GTAGGTATAT GTTACTGACT TCGTCAGTTC TATCTGCAAC CTCAAAACGG TGTTTGAGCT 240 

GACTTCGTCA GTTCTATCTG CAACCTCAAA ACGGTGTTTT GAGCTGACTT CGTCAGTCGT 300 

ATCTACAACC TCAAAACAGT GTTTTGAGCT GACTTCGTCA GTTCTATCTG CAACCTCAAA 360 

ACAGTGTTTT GAGCAGCCCG TGGCTAGTTT CCTAGTTTGC TCTTTGATTT TCATTGAGTA 420 

TAACACAAAA GGTAGCCCAT CAGCTACCTT TTTCTTATGC TTCCTCAATC AAGCGAGTAT 480 

GTTCTCTCTT GATACAGCGA TTCATCACGA TATCATCACA TCCACCATCA CGCAAAATCT 540 

CTTTCGCTTC TAAACTTTCA AGTCCTAGCT GTGCCCAAAA AATCTTGGCA TCAGCTTTGA 600 

GAAAATCACG CGCCACATCG GGCAGAAATT CACTGCGACG ATAAACATTG ACAATATCTA 660 

CAGGAAAAGG AATTTCAGCG AGGCTAGCAT AAGCCTTTTC ACCCAAGATT TCGCCACCTG 720 

CCGCCTTGGG ATTGACTGGG ATGATTTTAT AGCCCCGAGC CTGCATTTCC TTTGTTACTC 780 

GATTGCTGGT TGTTTCTTCA CGGTCAGACA AACCCACCAC AGCAAGGGTT TTACTCGTTG 840 

CGAGATACTG ACGAATCACG CCATCACTTG GATTGATAAA TTCTTGACTC ATAGAAATCC 900 

TCCTTTTTCA TCAGTATAGC ACATTTTGAA AAGGTTTGCA GAATTATACT ACAAAAAAGG 960 

AGGACTAGCC CCCTTTTTAT TTAGCCTCGT ACCAGGTTGC CCCTTCATTC TCATCTGCGA 1020 

TAAGAGGAAC ACTGAGTTGA ATGGCTTCTT CCATGGTTTG TTTCACCAAT TTTTTCATCT 1080 

CTACCAATTC AGATTTAGGC ACTTCAAGGA CGATTTCATC GTGCACTTGT AACAGCATCT 1140

TAGTCTGATA ACCACCTGCA ACCAAGGCTT TATCCAGCTG AATCATGGCA ATCTTGAGAA 1200

TATCTGCTGC CGAACCCTGG ATAGGTGAGT TGATAGCAGT TCGCTCCGCA AAACCACGAA 1260

TATTGAAGTT GCGCGAATTG ATATCTGGCA ACTCACGGCG ACGCTTAAAG AGGGTCTCTA 1320

CATAGCCCTT ATCACGCGCC TCCCGCACCA CTTCATCCAT GTAGTTTTTA ATACCTGGAA 1380

AACGTTCAAA GTAGGTATCA ATGTAGGCTT TGGCTTCCTT ACGACTAATT CCCAAATTAT 1440

TAGACAAGCC AAAGTCTGAA ATCCCATAAA CCACTCCAAA GTTAACTGCC TTGGCATTGC 1500

GACGGTCGTT TGCAGTCACA TCATCAGGAC GCTCAATGCC AAAGACCCGC ATGGCTGTCG 1560

AAGTATGGAT ATCTGCCCCC TCTTGGAAGG CCTTAATCAA GTGCTCATCC TTAGAAATAT 1620

GCGCCAAAAC GCGCAATTCA ATCTGTGAAT AGTCAGAGCT GAGTAGCACA CTATCCTCCC 1680

ACTCTGGCAC AAAAGCCTTC CGAATCAAGC GCCCCTGTTC CAATCGGGCA GGAATATTTT 1740

GCAAGTTTGG ATCCACACTA GACAAACGCC CGGTCTGGGT CAAATCCTGC ACATAGCGAG 1800

TATGAATCTT TCCATCAGCC AAAATCCAGT CCTGCAAGCG AATTACATAA GTAGATTGAA 1860

TCTTAGCAAT TTGACGGTAA TCCAGGATTT TGTTAACAAT CGGAGCAATA GGAGCGAGAC 1920

GCTCTAAAAC ATCCACTGCT GTCGAATAAC CTGTCTTGGT TTTCTTAGTG TATTCTAGAG 1980

GAAGTCCCAA TTTCTCAAAG AGAAGCACGC CCAACTGCTT AGGCGAGTTG ACATTAAACT 2040

CCTCACCAGC CAGCTCGTAA ATCTCTTGAG TCAGTTTTTC AATGACAAGC TCATTTTCAG 2100

CCTGCATCTC AAGCAAGGTC TCTTTCTTGA CCATAATCCC AGCAATTTCC ATCTTGGCAA 2160

GGACAAAAGC CAGAGGTTGC TCCATATCAT AAAGAAGCTC TAATTGCCCA TTTTCGCTGA 2220

GTTTTTCAAG TAAAATAGGC TCTGTTTCTA CCAAAACAGC AAGTTTACAA GCTAAGTGTT 2280

CCAAGAATTT CTCACGTTCA GGAATGGCCT TTTTAACACC CTTACCGTAG AAAGTTTCAT 2340

CATCAACCAA GTAAGTCTGA CCATAAAGAC TAGCGATGGT CGCAATTTCA TTGTCCTCCA 2400

CAGTCGAAAG GAGGTATTTA GCCAAACGGA TGTCAAAAGC AGGCGCGTGC AAATCCACAC 2460

CAAAACGTTG CAAAAGAACT TTAACCTTCT TAAAGTCATA AACTCTCAGA GATGTTTTTT 2520

CTAAGAAATC CTTGAAAATC GGGTCTTGCA ACAGCTCAAG CTTGTCTGTG GCATAGAGCT 2580

TATCCCCACA AGACCAGACA AATCCAACCA AATTATCCGT ATGGTAATTC TCACCAAAAA 2640

GCTCAAAGTG GAAGATAGAC TCTTCACTCA GCATATCTTG ACTGATTTGG TCAACAATAG 2700

TAAAATCCAA ACTCTCAGAC ACATCAGCTG ACGACACATT TAAAGCCTGC TTTAGCTGTT 2760

TGAAGCCCAT CTCATCGTAG AATTTCCCAA GATTTTCAAC ATCTGGACCA CTATAGAGCA 2820

AGTCCTCTAA ACCAATCGCA ATCGGTGCCT TGGTATCAAT GGTCGCTAGT GTTTTAGACA 2880

AAAAGGCCTG TTCCTTGTCA TTGATGAGAT TTTCCTTCAT CTTAGAAGTC TTCATTCCAT 2940

CAATATTTTC ATAAATCCCC TCAAGCGAAC CATGCTCCAG CAAGAGCTTA ATACCCGTCT 3000 

TTTCACCGAC TTTGGTCACC CCAGGGATAT TATCCGACTT ATCACCCATG AGCGCCTTGA 3060 

GATCGATAAA CTGAGCTGGT GTGAGGCCCA TTTCTTCCAT GAGGTAATCT GGCGTAAAGG 3120 

CCTCAAACTC AGCCACACCT TTCTTGGAAA TTTCAACCAC CGTATGCTCA TCCGTCAGCT 3180 

GAATCAAATC CTTGTCCCCA CTGACAATAG TAATATCAAA ACCATCCTGC TCTGCTAGCT 3240 

TATCCAGCGT CCCAATGATG TCATCCGCCT CATACTGAGC CAGATCATAG TGACGAATCC 3300 

CCATATGATC CAGCAACTCA CGAATGAAAG GAAATTGCTC ACGAAACTCA TCAGGAGTCT 3360 

TGGCCCGACC ACCCTTATAG TCCGCATACA TCTCTGTCCG GAAGGTCGTC TTTCCCGCAT 3420 

CAAAAGCCAC CAAAATATGA CTCGGCTCAA CCCGCTCCAA TAAATGACTC AACATCAACT 3480 

GAAAACCATA AATCGCATTG GTATGCAAAC CAGCCACATT CTTAAAACGG TCCAACTGCT 3540

GATACAGCGC AAAAAACGCC CGAAAAGCTA CAGAAGACCC ATCAATCAAT AATAATTTTT 3600 

TCTTATCCAT ACACCCATTA TAAAGGAAAG AATCAAAAAA TACCATTGGG AAGAGCTAGA 3660 

GCAAGTATTT TTCAAACTTT TTCCGAATAA ATAGATAGAG CCAGAGAATT TAGTAAACCT 3720 

AGATTTAAAA ATGTGCTATA ATATAGTATA TTGAATCTAT AATAGTACAC CTTGACTGCT 3780 

AAAATATTTC TATAAATTAA TTTGACTTTC CTGATAGAGT TATTCACATC TTATTTCAAC 3840 

TCACTATAGA AGGAGGAATA GGAGGATTCT CAGACATCCG GGCATCAGCC CAACTAATGA 3900 

TTTGATTGCT AAGAAAATAT TCAGCAATCC AGAAATCACT TGTCAATTTA TTCGCGATAT 3960 

GCTGGACTTG CCAGCAAAAA ATGTGACCAT TTTGGAGGGA AGCGATATTC AGGTATTACT 4020 

CTCCATGCCT TACTCGGTGC AGGATTTTTA TACCAGTATA GACGTCTTGG CGGAGTTGGA 4080 

TAACGGTACT CAAGTAATTA TTGAGATTCA AGTCCATCAT CAGAATTTTT TCATCAATCA 4140 

CTTGTGGGCT TACCTGTGCA GTCAGGTTAA TCAAAATCTT GAAAAAATTC GTCAGCGAGA 4200 

AGGTGATACT CACTAGAGCT ACAAACACAT CGCTCCTGTT TACGCCATTG CTATCGTGGA 4260 

TAGTAATTAT TTCTCAGATG ACCTGGCTTT TCATAGCTTT AGTATGCGCG AAGACACAAC 4320 

AGGTGAGGTA TTGGCGATTA CCAACAATGG ACAGGAAAAC CATCTGGTTA AGATGGCATT 4380 

CTTGGAATTA AAAAATACAG AGAAACCAGC AAAGACAAGG TTCGCAAGCC ATGGTTGGAG 4440 

TTTTTCGGCA ACAAGCCCTT TACCCAGCAA CCGCAACGAG CCATTACCCA AGCAAATCAA 4500 

CTGCTGGACT ACAAGAGCTG GTCCGAGGAG GACAGGAAAA TGTTTAGTCA ACTACATATG 4560 

CGAGAAGAAC AAGTCTTGTT AGCACAGGAC TATGCCTTGG AAACTGCTAG GGCTGAAGGC 4620 

CTTGAACAAG GACTAGAGCG TGGGAAAGTT GAAGGAAGGG CAGAAAGGAA ACTTTTTGCC 4680

TTCCTAGACA TAGTACGCCA AGGTCTTCTG ACTTCTGAGG TTGCCAGCCA GCAATTAGGT 4740

ATGTCAGTAT CTGAATTTGA GGCACTGTTG TAAAATGGCT CCATAATATC CATAGTGGGT 4800

AAATCCCCTA TGGATATTAT GGAGCCTATT TTGTGTAGAA AAAAAGTCCC ATATGACCTA 4860

TAATGAAAAG CGACAAAACA ACTCATTAGA AAGAATCATA TGGAACAATT ACATTTTATC 4920

ACAAAATTAC TAGACATTAA AGACCCTAAT GTCCAGATTT TAAACATCAT CAATAAGGAT 4980

ACACACAAGG AAATCATCGC CAAACTGGAC TACGACGCCC CATCTTGCCC TGAGTGCGGA 5040

AACCAATTGA AGAAATATGA CTTTCAAAAA CCTTCTAAAA TTCCTTATCT TGAAACGACT 5100

GGTATGCCTA CAAGAATTCT CCTTAGAAAG CGTCGATTCA AGTGCTATCA CTGTTCAAAA 5160

ATGATGGTCG CTGAAACTTC TGATGACGTA CAGTCATATT TCTTCTCTTT TTATTATATC 5220

ACAGTTTTAA ATCTAGCTTT ACTAGATTCA CCGCTACTAT CTATTTATTC GGAAAAAAGA 5280

CGAAAAAACC TGAGAATCAT CTCAGGCTTG GTCATTAAAT TTTTTTCTCA ATATCGAAAA 5340

GTGGAGAAAG TGGTCGTTTT TCATGAATAG GTACGATAGC ATCCCCTAGG AGATGAGCGA 5400

TTGAAATCTG CTCAATCTTA TCAATCAAAC GCTCTTCTGG CAGATAGATG GTATCCAAAA 5460

CAACCAATTT CTTAATAGCT GATTTTTGGA TATTGTCCGT AGCAGGACCA GAAAGAACTG 5520

GGTGCGTACA GCTTGCATAG ACTTCAACAG CACCAGCTTC CGCAAGAGCA TCTGCCGCAT 5580

GACAAATCGT TCCAGCGGTA TCAATCATAT CATCAATCAA GATACAAGTC TTGCCTTCAA 5640

CCTTACCGAT GATATTCATA ACTTCACTAG TATTCATCTT ATCAACGCTA CGACGTTTAT 5700

CAATAATAGC GATAGATGTT TTCAAAAATT CTGCCAACTT ACGAGCACGA GTCACCCCTC 5760

CATGGTCCGG GCTGACAACC ACATAGTCAG AACCAACCAT ACCACGACGC TCAAAATAAT 5820

CTGCAATCAG AGGAGCACCC ATCAAATGAT CCACAGGAAT ATCAAAGAAT CCTTGAATTT 5880

GCGCAGCATG CAAGTCGATG GTCAATAAAC GATCCACTCC AGCTACTTCA AGCATATTTG 5940

CGACAAGTTT TGAAGTGATT GGCTCACGCG CTCTCGCCTT TCTATCCTGA CGTGCATACC 6000

CATAGTAAGG CATGACAACA TTGACAGATT CTGCACTCGC ACGCTTCAAA GCATCTACCA 6060

TAATCAAAAT TTCAAGCAGA TTGTCATTTA CAGGCGAACT AGTTGATTGT AAGATAAAGA 6120

CGTGTTTCCC ACGGATTGAT TCTTCAATGT TGACCTGAAT CTCTCCATCT GAAAATTGGC 6180

GAACACTTGA TTTCCCCAAC TCTATCCCAA TCTCCTGCGC CACACGTTCT GCCAATTCTT 6240

TATTAGAAGA AAGGGCAAAC AGCTTTAAAT CAGAAAAAGA CATGATTTCC TCCGGTATAT 6300

ATGTATAACT TGTGCTTTTC ACAAGATTTT CCATCTACCA TTGTAGCGGT TTTTGCACTA 6360

TTTTTCAATC AAAAATAAAA GAAGGGCACC ATATTTGTAC CCTTGCATCA TTCTTTTGAA 6420

AAATATTCTA GGTCATCAAC TCATTGTGTT TCTCAACAAA GCAATAAGCA TGATAAAAAC 6480

CATAGAGAGC AATAGCCGTA ACCACTGGAA TCGCTAAAGG CAACTCTGTT TCCAACTCCA 6540

CAAAAGGAGA GTTAAACAAG AAGTGAGTTC CCAAGGCTAA ACCTAGAAAA ATAAGGCCCT 6600 

GTTTCTTGCC AACCTTCTGT CGTTTATAGG CTCTGTAAAG CAAGTAAACA CCTACTACAG 6660 

CTAGACCTGA AAAAGTCCAG TGAGAGGCAA TTCCTGAGAT GATACGCTCT AAAATTCGCG 6720 

AAATAGTAAA GTCAAAGCCC TCTGGCAAAT CCGTACGAAT ATAACCAATA TCCTTAATCA 6780 

TTTGGAATCC CAAACCGGAA GCAATTCCAA GTAAAAACAA AGATTTTAAT TTTCGCACAG 6840 

GAATCAAAGC CAAAACAAAA ACAAGTGACA ATAATTTCAA GGGTTCTTCT ACCAAAGGAG 6900 

CCGCAATAGC ACTTTCAAAG GCATTTAAAA ATGGACTATC TGGGAAAAGA ACCCCCAGTA 6960 

AATCATGGAT ATAAGTATTA GCAAAACTAG ACAACCAGCC TGAAAGGAAC ATCCCTCCCA 7020 

ATAAAGACAG AATCAAAACC TTCTTTGGCA ATTCCCATTT TTCCCAATAC GGAAGAGAAA 7080 

ATAAAGAGCC GGAATCATGT AAAAGAGAGC TAGAAAGATA GAAACTCCCA TTAGTCCATA 7140 

TTCCGCACCT GACCTCGAAC CGTCCGTATA GTAGATGGTT TCATACTGTA AACCAATACA 7200 

TAGCAATAAA ATAAAAATAA ATAAAATATT GCTTTTCTTC ATACACTTTC TTTCTAAATG 7260 

AAGTATTTAT AATTCTACGA CTGTCATACT TCCTGTATCA ACATTGTAAA TGGCACCAGA 7320 

GATAATGACA TCGTCTGGTA TTAGGGGAGA CTCGATAAGC AGTTGCATAT CCTCGCGTAC 7380 

ACTCTCTTCT ATATCTTGGA AGGGCAAGAA GTCCTGGTCT GACACATCGA CACCCAATTC 7440 

TTCCTTCAAA TACTCCTGAA AAGGTTCATT TTCAAAGGTC TGAGCACCAC AGTCTGTATG 7500 

ATGCAATACC ACAATTTCTC TTGTCCCCAT TTGTTGCTGG GAAATAACTA GAGAACGAAT 7560 

CATATCCTCA GTCACTCGAC CACCTGCATT CCGCAAAATA TGAGCATCCC CAAGTGCCAA 7620 

ACCTAGAGCT TGCGCAACGT GCAAACGTGA GTCCATACAG GTCACAATGG CTACTCTGGT 7680 

TTTAGGTTTA AGTGGCAGAT TTAACTGCCC ATGTAGGGCA ACATAAGCCT GATTGGCTTG 7740 

CATAAACTGT TCAAAATACG ACACGATTCC CTCCTTGAAA ATTTGATAGT CAAATATTTC 7800 

TCCTATCTTA TCATTTTTAA GAGAATTTGT CACGGATTAT GCAAAGACCT TTTTCAAGAC 7860 

TTCCTGAATC GTTGTCACGC CAATGACCTG AATTTCCTTA GGCAGAGTGA TTCCTGTCAA 7920 

GGAATTCTTA GGTACATAAA TCTTAGTAAA GCCCAGTTTA GCAGCTTCGT TGATGCGTTG 7980 

CTCAATACGA TTCACGCGCC GAATCTCTCC TGTCAAGCCC AGTTCTCCGA CAAAACATTC 8040 

CTGAGGATTA GTTGGCTTGT CTTTGTAGCT CGAAGCAATA GCAACTGCAA CAGCCAAGTC 8100

AATCGCAGGT TCATCCAATT TAACACCACC AGCAGATTTG AGATAGGCAT CCTGATTTTG 8160 

CAAGAGAAGC CCTGCCCGTT TTTCCAAAAC AGCCATAATC AAGCTAGCAC GGTTAAAATC 8220

AAGTCCTGTC GTAGTACGCT TGGCATTTCC AAACATGGTC GGTGTTACCA AAGCCTGAAC 8280

CTCCGCCAAA ATCGGACGCG TCCCTTCCAT GGTTACAACG ATGGAGGAAC CAGTCGCCCC 8340

ATCCAAACGC TCTTCTAGGA AAACTTGACT CGGATTGAGT ACCTCAACCA AGCCGCCCGA 8400

CTGCATCTCA AAAATCCCAA TCTCATTAGT GGAACCAAAA CGATTTTTGA CCGCTCTCAA 8460

AATACGAAAG GTGTGGTGAC GCTCCCCTTC AAAGTAAAGC ACCGTATCCA CCATATGCTC 8520

CAACATACGA GGCCCAGCCA AGGTTCCTTC TTTGGTCACA TGACCTACGA TAAAGATGGC 8580

AATGTTATTG GTCTTGGCCA ACTGCATGAG TTCAGCGGTC ACTTCACGCA CCTGAGAAAC 8640

AGACCCCTGC ACCCCTGAAA TCTCAGGAGA CATGATGGTC TGGATGGAAT CAATAATGAG 8700

AAAGTCTGGC TGGATACGCT CCACTTCTGC ACGAACACTC TGCATATTGG TCTCTGCATA 8760

GAGATAAAAC TCACTATCAA TATCACCTAA GCGCTCTGCA CGTAGTTTAA TCTGCTGGGC 8820

AGACTCCTCC CCACTGACAT AGAGAACTGT CCCCACTTGG GACAACTGGG TTGAGACTTG 8880

TAGGAGAAGA GTTGATTTCC CAATCCCAGG ATCCCCACCG ATAAGGACGA GACTTCCTGG 8940

TACCACTCCG CCTCCAAGCA CACGGTTGAA TTCCTCCATC TCCGTCTTGG TTCGATTGAG 9000

ATTGATGGAA GTCACCTCAG CTAGTTTCAT GGGCTTGGTT TTCTCACCTG TCAAGGACAC 9060

ACGCGCATTC TTAACTTCGG CAACCTCAAC CTCTTCCACA AAAGAAGACC AAGACCCACA 9120

GTTGGGGCAA CGTCCCAGAT ATTTAGGGGA ATTATACCCA CAATTTTGAC ATACAAATGT 9180

CGCTTTTTTC TTTGCGATGA CAAACCTCTT TCTATATCTC TAACTCACAC TCAATCACTT 9240

GGCAAAAATC AATCTTCTCA TTTGGCACAA ACTGGCGCAT GAGCATTCGA TGAGCAACAA 9300

CTACCACAGT CTGATGTTCT CGATACTTAG ACATACATTC TAGAAACCGA GACTTCATTT 9360

CCGTAGCTGT CTCATATTGA ATAGGACTAT TAGGAAGCAA CTCCCCCTTG TTTTCTAAAA 9420

ACAGTCTTCT AGCTGTTTCA AAGTTTTCTA TTCCTGTTTT ATAGACCTGC CATTCATGTA 9480

ATAAAGGCTC TACTGTTAAA GGAAGACCCG TAGCACAGAC CACATACGAA GCCGTTTCTA 9540

AAGCTCTTGT GACTGCAGAA GATACGATTA TTTCAGCTGA CGAGAGTAAA GGATTTTTGC 9600

TCAATTTCTG GACTTGCTGC CGTCCCATCT CAGACAAGGG TGCCAAATCT ATCCCAAATC 9660

CTATATAAGA ACGCTCCTCT AACTCACGGT AATCTGGCTC CCCATGACGT ACAAAGATAA 9720

TCTTCATTCT AGTGCCCTGT CGATCCAAAT CCACCAGTTC GAACGCCATC AGCTGCATCT 9780

CCATCTGCAA TTAAGAAAGT AGCAAAAACA GCCTGGACAA TACGCTCCCC AACTTCAAGA 9840

ACAACCTCTT GGTCTGTGAT ATTCTTCATC TGCGCAAAAA TATGCCCTTC ATTTGCAGGA 9900

TTTCCATAAT AATCCCCATC AATGACTCCA ACTGAGTTAA TTAAAACCAA GCCCTTCTTA 9960

CGAGGATTTG AAGAACGATC ATAGAGGTAG AGAACCTCAG TCGGCTGCAT ATAAGCCTTA 10020

ACCCCTGTCG GAACCAAGAC AATCTCTCCT GGCGCAACAA CTGTACGCAC AGCAACCTTT 10080 

AAGTCGTAAC CAGTCGCATG CGCTGTCTCA CGCTTGGGCA ATAAATTTTC ATCTGTAAAA 10140 

CTCGAAACCA ATTCAAAACC ACGAATTTTC ATAATTTTCT CTTTTCTATT ATCATTTATT 10200 

CTAGATTATT CTATACTTAT TTA 10223 
(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16535 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

TGGTTCTGTC CTTATCGGCG CCTTGTCTTG CTTGCCATGG CTACACCAAC TATCTCATCC 60 

GACGAAAGTA CACCAACCAC TAACGAACCC AACAACAGAA ATACAACCAC CCTTGCCCAA 120 

CCTCTTACTG ATACAGCAGC TGGCTCTGGT AAGAACGAAA GTGATATTTC TTCACCTGGA 180 

AATGCAAACG CTTCCCTAGA GAAAACAGAA GAAAAACCTG CTGCAAGCCC AGCCGATCCA 240 

GCACCACAAA CTGGACAAGA TCGTTCAAGT GAGCCAACTA CTTCTACTAG TCCAGTAACA 300 

ACTGAAACTA AGGCAGAAGA GCCCATCGAA GATAACTACT TCCGTATCCA TGTCAAAAAA 360 

CTTCCTGAAG AAAACAAGGA TGCTCAAGGA CTATGGACTT GGGACGATGT TGAAAAACCA 420 

TCTGAAAACT GGCCAAACGG AGCTTTGTCC TTCAAGGATG CCAAGAAAGA TGACTACGGC 480 

TATTACCTAG ATGTCAAATT AAAGGGAGAA CAAGCCAAGA AAATTAGCTT CCTCATCAAC 540 

AATACAGCTG GAAAAAATCT AACCGGCGAT AAATCTGTAG AAAAACTAGT TCCAAAAATG 600 

AACGAAGCTT GGTTAGACCA AGATTACAAG GTTTTCTCTT ACGAGCCACA GCCTGCAGGA 660

ACTGTTCGCG TCAACTACTA CCGCACAGAT GGCAACTATG ACAAGAAATC TCTCTGGTAC 720 

TGGGGAGATG TGAAAAATCC AAGTAGCGCT CAATGGCCTG ACGGAACAGA CTTTACGGCT 780 

ACAGGCAAAT ATGGCCGCTA TATCGACATT CCTCTTAATG AAGCCGCAAG AGAATTTGGA 840 

TTTTTATTAC TAGATGAGAG CAAACAAGGA GACGACGTGA AAATCCGTAA AGAAAATTAT 900 

AAGTTCACAG ATTTGAAAAA TCATAGCCAA ATTTTCCTAA AAGACGATGA TGAATCGATT 960 

TACACAAATC CATACTATGT CCATGATATC CGTATGACAG GAGCCCAACA CGTAGGCACT 1020 

TCTAGCATTG AAAGTAGCTT TTCAACACTT GTCGGTGCTA AAAAAGAAGA TATCCTCAAA 1080 

CACTCCAACA TCACTAATCA CCTAGGAAAC AAGGTAACTA TTACCGATGT TGCAATCGAT 1140 
GAAGCTGGTA AGAAAGTGAC CTACAGCGGA GATTTCTCTG ACACAAAACA TCCTTATACT 1200

GTTAGCTACA ATTCCGACCA ATTCACTACC AAAACAAGCT GGCGCCTGAA AGATGAGACA 1260

TACAGCTATG ATGGCAAACT GGGAGCTGAC CTAAAAGAAG AAGGAAAACA AGTTGATTTG 1320

ACCCTTTGGT CACCAAGTGC TGATAAGGTT TCTGTTGTTG TCTACGACAA GAATGACCCT 1380

GACAAAGTAG TTGGAACTGT CGCTCTTGAA AAAGGGGAAA GAGGAACTTG GAAACAAACT 1440

CTAGACAGCA CAAACAAACT CGGAATCACA GATTTCACTG GCTACTATTA TCAATACCAA 1500

ATCGAGCGTC AAGGTAAAAC TGTTCTTGCA CTCGATCCTT ACGCTAAATC TCTTGCTGCT 1560

TGGAATAGCG ACGATTCCAA GATTGACGAT GCCCATAAAG TGGCTAAAGC CGCCTTTGTA 1620

GATCCAGCTA AACTCGGACC TCAAGACTTG ACTTATGGTA AGATTCACAA TTTCAAGACT 1680

CGTGAAGACG CCGTTATCTA CGAAGCTCAT GTGCGTGATT TCACTTCAGA TCCTGCCATT 1740

GCAAAAGACT TGACCAAACC ATTTGGGACT TTTGAAGCCT TCATTGAAAA ACTAGACTAT 1800

CTCAAAGACT TGGGTGTAAC CCATATCCAG CTCCTTCCAG TCTTGTCTTA CTACTTTGTC 1860

AATGAATTGA AAAACCATGA ACGCTTGTCT GACTACGCTT CAAGCAACAG CAACTACAAC 1920

TGGGGATATG ACCCTCAAAA CTACTTCTCC TTGACTGGTA TGTACTCAAG CGATCCTAAG 1980

AATCCAGAAA AACGAATCGC AGAATTTAAA AACCTCATCA ACGAAATCCA CAAACGTGGT 2040

ATGGGAGCTA TCCTAGATGT CGTTTATAAC CACACAGCCA AAGTCGATCT CTTTGAAGAT 2100

TTGGAACCAA ACTACTACCA CTTTATGGAT GCCGATGGCA CACCTCGAAC TAGCTTTGGT 2160

GGTGGACGCT TGGGGACAAC CCACCATATG ACCAAACGGC TCCTAATTGA CTCTATCAAA 2220

TACCTAGTTG ATACCTACAA AGTGGATGGC TTCCGTTTCG ATATGATGGG AGACCATGAC 2280

GCCGCTTCTA TCGAAGAAGC TTACAAGGCT GCACGCGCCC TCAATCCAAA CCTCATCATG 2340

CTTGGTGAAG GTTGGAGAAC CTATGCCGGT GATGAAAACA TGCCTACTAA AGCTGCTGAC 2400

CAAGATTGGA TGAAACATAC CGATACTGTC GCTGTCTTTT CAGATGACAT CCGTAACAAC 2460

CTCAAATCTG GTTATCCAAA CGAAGGTCAA CCTGCCTTTA TCACAGGTGG CAAGCGTGAT 2520

GTCAACACCA TCTTTAAAAA TCTCATTGCT CAACCAACTA ACTTTGAAGC TGACAGCCCT 2580

GGAGATGTCA TCCAATACAT CGCAGCCCAT GATAACTTGA CCCTCTTTGA CATCATTGCC 2640

CAGTCTATCA AAAAAGACCC AAGCAAGGCT GAGAACTATG CTGAAATCCA CCGTCGTTTA 2700

CGACTTGGAA ATCTCATGGT CTTGACAGCT CAAGGAACTC CATTTATCCA CTCCGGTCAG 2760

GAATATGGAC GTACTAAACA ATTCCGTGAC CCAGCCTACA AGACTCCAGT AGCAGAGGAT 2820

AAGGTTCCAA ACAAATCTCA CTTGTTGCGT GATAAGGACG GCAACCCATT TGACTATCCT 2880
TACTTCATCC ATGACTCTTA CGATTCTAGT GATGCAGTCA ACAAGTTTGA CTGGACTAAG 2940

GCTACAGATG GTAAAGCTTA TCCTGAAAAT GTCAAGAGCC GTGACTATAT GAAAGGTTTG 3000

ATTGCCCTTC GTCAATCTAC AGATGCCTTC CGACTTAAGA GTCTTCAAGA TATCAAAGAC 3060

CGTGTCCACC TCATCACTGT CCCAGGCCAA AATGGTGTGG AAAAAGAGGA TGTAGTGATT 3120

GGCTACCAAA TCACTGCTCC AAACGGCGAT ATCTACGCAG TCTTTGTCAA TGCGGATGAA 3180

AAAGCTCGCG AATTTAATTT GGGAACTGCC TTTGCACATC TAAGAAATGC GGAAGTTTTG 3240

GCAGATGAAA ACCAAGCAGG ACCAGTCGGA ATTGCCAACC CGAAAGGACT TGAATGGACT 3300

GAAAAAGGCT TGAAATTGAA TGCCCTTACA GCTACTGTTC TTCGAGTCTC TCAAAATGGA 3360

ACTAGCCATG AGTCAACTGC AGAAGAGAAA CCAGACTCAA CCCCTTCCAA GCCTGAACAT 3420

CAAAATGAAG CTTCTCACCC TGCACATCAA GACCCAGCTC CAGAAGCTAG ACCTGATTCT 3480

ACTAAACCAG ATGCCAAAGT AGCTGATGCG GAAAATAAAC CTAGCCAAGC TACAGGTGAT 3540

TCACAAGCTG AACAACCAGC ACAAGAAGCA CAAGCATCAT CTGTAAAAGA AGCGGTTCGA 3600

AACGAATCGG TAGAAAACTC TAGCAAGGAA AATATACCTG CAACCCCAGA TAAACAAGCT 3660

GAACTTCCAA ATACAGGAAT CAAAAACGAA AACAAACTCC TATTTGCAGG AATCAGCCTC 3720

CTTGCGCTCC TTGGTCTCGG TTTCTTACTA AAAAATAAAA AAGAGAACTA AACTAGCCCT 3780

CCTATAGAAA AATCCCCCAA GCATTATAGC TCGGGGGATT AATTTTTGTA CAATATTTGT 3840

TGTCCTAATA AACTTGATTA GGATTTTTTA TTAAGCCTCT TTCATAGCAA AATAAGCTCG 3900

TACTTTGGGT GCAACTTGTG TTCCGAAGAG TTCAATAGCT CTCAGAACCT GGTCATGAGG 3960

CATAGAACCA AGCGGTAGAT GAAGCATGAA GCGGTCCAAT CCTAAATCCT CTATCATGCG 4020

AATCAATTTT TCGGCCACCT GATCTGGATT GCCAACAAAC ATGGCGCCAT TTGGCCCTAC 4080

CTGCTCCAAA TATTGCTCAT AACGCAATTC CTGCCAGTGC GGACGGTCTT TGGAAATAGG 4140

ATCCACCACT TGCTTAGTCG GATGGAAATA ATCTTTCACC GCCTGCTCAC CATCTTCCGC 4200

AATCCACCCC CAAGAATGGG CTCCCACTTT CAAGTCTTTG TCAGCATGGC CCCTTCGC1T 4260

CCAATCTCAC GATAAGCCTG AATCAACTTT TTAAAATAAC GTGGATTACC ACCAATAATA 4320

GCATATACAA TCGGTAGACC AGCCTGAGCA ATCTTCACTG TTGATTCGAC ATGACCACCT 4380

GTAGCTATCC ACAAGGGCAA TTTGTCCTGA ACTGGACGAG GATAAACTTC TTTAGCAGCA 4440

ATCGTTTGAG TCAATCGACC TTGCCAGTCT AACTTGGTCT TTTCATTGAC TAACTGAAGC 4500

AAGTCTAATT TCTCATCAAA AAGAGAGTCG TAGTCTTTCA AGTCATAACC AAACAGAGGG 4560

AAAGATTCCG TGAAAGAGCC CCTTCCAGCC ATAATCTCCG ATCGTCCATT TGACAAAGCA 4620

TCGATAGTGG CATACTGTTG GAACAAACGA ATCGGGTCCA TGCTTGACAG AATGCTGACT 4680
GCACTGGTCA AACGGATTTT CTTGGTATTG ACTGCCCCAG CGGCCAGAAC AATCTCTGGG 4740

GCTGATACTG CAAAATCCGC CCGATGGTGC TCACCAATCC CATATACATC CAAACCAACC 4800

TTGTCAGCCA GCTCAATCTC TGCCACCAAC TGGCGAATGC GTTCAGCATG ACTGTAAGTT 4860

TGTCCAGTCC CTTCAAGCTC CGTTATTTCC CCAAATGTTG AAATTCCCAA TTCTACCATT 4920

GTGATTCTCC TTATCTATCT CTGTACTTCA ATTTGAAAAA TTATTCTAAC ACGAATCTTG 4980

AGTACAAGCA ACCGATTTGC TCATTAGAAA AAGCCTAGAT AACTAGACTT TTTTAGCTTA 5040

TTCTACCGTT ACTGACTTGG CAAGGTTACG TGGTTTGTCC ACATCGAGGC CACGGTGGAG 5100

GGTTGCAAAG TAAGCGACTA ATTGCGTTGG TACGACCATT GAAATTGGTG AGAGGTATGG 5160

ATGTAGGGTC GTAAGGACGA TATCGTCGGT ATCTTTGGCT ACATTCTCTT CTGCGATAGT 5220

GAGGACTTTG GCACCACGGG CTGCGACCTC TTGGATATTT CCACGAGTAT GATTGGCAAG 5280

AACTGGATCT GACAAGAGAG CCAAAACAGG CGTTCCTTCT TCAATCAAGG CAATGGTTCC 5340

GTGCTTGAGT TCTCCTGCAG CAAAGCCTTC ACACTGGATA TAAGAAATCT CTTTGAGTTT 5400

GAGACTTGCT TCCATGGCTA CGTAGTAATC TTGACCACGT CCGATGTAAA AGGCGTTAGG 5460

AGTTGTTTCA AGAAGTTCAC GAACCTTGAC TTCAATGGTT TCTTTCTCTG AAAGAGTTGA 5520

TTCGATAGAC TGAGCTACGA TTGACAATTC ATGAACCAGG TCAAAGGCTT GCGGTTTAGC 5580

ATTACCATTT GCTTCTCCGA CTGCTTTTGC AAGGAAGGGA AGGGCTGCGA TTTGCGCTGT 5640

ATAGGCTTTA GTTGATGCCA CGGCAATTTC AGGACCTGCG TGAAGGAGCA TGGTATAGTT 5700

GGCTTCACGT GAGAGGGTTG AACCTGGAAC GTTTGTCACT GTTAAGCTTG GAATTCCCAT 5760

TTCATTAGCC TTGACCAAAA CTTGACGACT ATCCGCTGTT TCACCAGATT GGCTGATAAA 5820

GATGAAGAGT GGTTTGTTGC TGAGAAGTGG CATACCGTAG CCCCACTCAG ATGAGATTCC 5880

AAGTTCAACT GGTGTATCTG TCAATTCTTC CAACATTTTC TTAGAAGCAA ATCCTGCATG 5940

GTAAGATGTT CCAGCTGCAA GGATGTAGAT GCGGTCTGCG TCTTGAACAG CCTTAATGAT 6000

ATCTGGGTCT ACGACAACTT GACCAGCCTC ATCTGTGTAG GCTTGGATGA GTTTCCGCAT 6060

AACAGTTGGT TGCTCGTCAA TTTCCTTGAG CATGTAGTAA GGGTAAGTTC CCTTACCGAT 6120

ATCTGACAAG TCAAGTTCAG CAGTGTAGCT AGCACGCTCA CGACGATTTC CATCATAGTC 6180

TTGAACTTCC ACACTATCAG CCTTGACGAT TACCAACTCT TGGTCATGGA TTTCCATGTA 6240

TTGGTTAGTT TCACGAATCA TAGCCATGGC GTCTGAGCAG ACCATGTTAT AGCCTTCTCC 6300

AAGACCAATC AAAAGTGGTG ATTTATTTTT AGCTACGTAG ATGACTTGAG GATCTTGTGA 6360

GTCAACCAAG GCAAAGGCAT AAGAACCACG GATGATGTGA AGGGCTTTTT TGAAGGCTTC 6420
AAGAACTGAG AGCCCTTCTT CTTCCGCAAA TTTTCCAATC AAATGAACGG CTATTTCAGT 6480
ATCTGTCTGC CCCTTGAAGT GGTGACCTGC AAGGTATTCT TCCTTGATTT CAAGATAGTT 6540 
CTCAATCACC CCATTATGCA CCAAGACAAA ACGTTCCGTC TCAGAGCGGT GTGGGTGAGC 6600 

ATTGTCCTCA GTTGGTTTTC CGTGAGTAGC CCAACGAGTA TGTCCGATAC CAGTTGTTCC 6660
CTCAACACCA GCTGTCTTGG CAGACAATTC TGCAATACGA CCAACCGCCT TCACCAAATG 6720 
GTTATCAGCA CCATCTAGGA CAAAAATTCC CGCAGAATCA TAGCCACGGT ATTCAAGCTT 6780 
TTCAAGCCCT TGAATCAAAA TATCAGTTGC ATTTGTGTTT CCAACAACAC CAACAATTCC 6840 
ACACATAGTA TATACGACAC AGGCAAGCTG TGCTTTCTCC TTAAAATTGG TATAGTCTAA 6900 

TTCATCTTTT ATAGAATCAG CAAAAACAGT ATATACTTGT TTCTTTCACT TGTCAAGAGT 6960

AAAAATTGGT ATAGTTCAAA TTAAGCTCCT GTAAGCATAA AAACTCTGAC CGATTGGGAT 7020 

AATCAGTCAG AGTCCTTTTT AAAATCCATT ATTATCGCTT AATTCTTTGA ACCAGTGGCC 7080 

TGATTTCTTC AGACGACGTT CTTGCGTTTC CAAGTCTAAT TCGACCAAAC CATAGCGATT 7140 

TTTATAGCTG TTGAGCCATG ACCAGCAGTC AATAAAGGTC CAAATCAAGT AGCCCTTACA 7200

GTTGGCACCA TCTTCAATGG CACGGTGAAG TTCACGAAGA TGACCTTTTA CAAAGTCAAT 7260 

ACGGTAATCA TCTTGAATCA TTCCATCTTG ACGGAATTTT TCTTCCCCTT CAACACCCAT 7320 

ACCATTCTCA GTCAACATCC ACTCAATATT GCCATAATTT TCCTTGATAT TTTGGGCGAT 7380 

GTCATAAATC CCTTGCTCAT AAATCTCCCA ACCACGGTGA GAATTGATTT TACGTCCAGG 7440 

CATCACATAA GGCTCGTAAA AATGTTCTGG TAAGAGTGGA CTCTCTGGAT GCTTAGCAAA 7500 

TCGAGGAGCC ATAACACGCA AAGGTTGATA GTAGTTCACA CCAAGGAAGT CCACCGTATT 7560 

ATCACGAATG AGTTCCAACT CTTCCTCTGT AGCATCAGGT AAAAGACCGT GTTCATGCAA 7620 

GATTTCTACC AACTCCTGTG GATAAGTCCC CAAGACAGAT GGATCTAAGA AAGATTGGGC 7680 

CTGAAAAAGG GCCGCAATAC GAGCTGCCTT GACATCAGCA GGATGCTGGC TACGTGGATA 7740 

AGCCGGTGTC AAGTTGAGGA CAATGCCAAT CTTGGAATCA GGCAAAAGTT CATGGCAAGC 7800 

CTTAACAGCC CGGCTGCTGG CCAATTGTGT ATGATAGGCT ACCTTAACAG CTGCCTCTGC 7860 

ATCCACCTTA TGTGGATAAT GGGCATCATA AAAATAACCA AATTCTACAG GAACGATGGG 7920 

CTCGTTAAAG GTAATCCATT GATCCACTAA ATCTCCATAA GTCTCAAAAC AAAAACGAGC 7980 

ATAGTCTTCA TAGGCTGAGA CTGTCGCCTT ATTTTCCCAA CCATCACCAT CCTCTTGAAG 8040 

GGCAAAAGGT AAATCAAAAT GATAGAGATT GACTAACAGA CGAATTCCTT TAGCCTTAAT 8100 

AGCCTCAAAG ACCTTACGAT AAAAATCCAC ACCTTGAGTG TTGACTTTTC CACAGCCTTG 8160 

TGGAAAAATC CGTGACCACT GAATAGAAGT CCGAAAGGCT GTGTGACCAG TCTCTAACAA 8220 
AAGCTCAATA TCCCGCTCCC AATTTTCATA AAAAGTCGAT GTCTTATCTG AACCAATCCC 8280

ATTATAGTAA CGATTTGGCT CCACTTGGAA CCAGTAATCC CAGAGATTGT CTCCCTTACC 8340

GTCACCAGCT ACACGTCCTT CTGTCTGCGG TCCAGAAGTA GAGGATCCCC AGACAAAATC 8400

CTTTGGAAAT CTTAGCATAC ATTTACCTCT TTATCTACTC ATTTCTCCCA TTATACAGAA 8460

AAAACAAGGT AAAAACTAGT TACATTTTTT CCTTGTTTTT CTTCTGATTA TAGTTTTTAT 8520

TTCTTGCTTA GGATTTCAAG CGTTTCAAGC ACGTTATCTG CATGAACCTC AATGGTGTCA 8580

CCAGTTGCCT TGATCTTAAC TTCTACAATG CCATCGGCCG CTTTTTTACC AACAGTGATA 8640

CGGATTGGAA GACCAATCAA GTCACTATCG CTAAATTTAA CACCGACACG TTCGTTACGG 8700

TCATCTGTCA AGACTTCATA ACCAGCTCCC ATCAAGCTTG CTTCAAGTTT TTCTGTCAAG 8760

GCTTGCGCTT CTTCATCCTT GACATTGACA GTAATCAAAT GCACATCAAA TGGTGCCAAT 8820

TCTTTAGGGA AATTGATTCC CCAAGCGTAA CGGTATTCAC CTTTTGGCGT TTTGTTAACA 8880

AAGAGGCGAG CGTGTTGCTC CATCACTGCT GAAAGAAGAC GGCTGACACC GATACCGTAA 8940

CATCCCATGA TGATTGGCAC AGCACGACCA TTTTCATCCA AGACATCTGC TCCCATGCTT 9000

GCTGAATAGC GAGTTCCGAG TTTGAAAATA TGACCGATCT CAATACCACG CGCAAAGTTA 9060

AGGACACCTT GTCCATCTGG GGAAATTTCA CCCTCACGAA CTTCACGGAT ATCCACATAT 9120

TCTGCAGTAA AATCACGGCC TGGGTTCACA CCAGTCAAGT GGTAGTCATC TTCGTTAGCA 9180

CCGACAACTG CATTGCGAAC ATCTTGTACC TTACGATCTG CAATAATTTT AATATTCTCT 9240

GGCAAACCAA CTGGTCCAAG TGAACCAAAT CCTGCTTGAA CAACATTCGC CACTTCTTCT 9300

TCGCTAGCAA CGTCAAAGAA ATCTGCTCCC AAGTGATTTT TCAACTTGAC TTCGTTGAGT 9360

TGGTCATTTC CAACTAGAAG GGCTGCAACA AGCTCACCAT CTGCAATGTA GAAGAGGGTT 9420

TTAATCGTTT GTTCTTCTGG AACATTGAGG AAGGCTGCAA CTTCATCAAT TGATTTAACA 9480

TCTGGCGTTG CAACACGAGT AACTTCTTCT TCAGCGACAA CACGGTTGCT TGGTTTGTAC 9540

TCGTTTGTTG CCATTTCTAA GTTAGCTGCA TAGCTAGACT CACTTGAGTA AGCAATGGTA 9600

TCTTCACCAG AGACTATCCA TTTGAGCAAT TCTGCCTTGA TTTCTTCTTG CACTTCTGCA 9660

GGAATTTCGT CAAATGAGGC AACTGACTTG TCCAAGACAA CCCAGCGGTC AAGGTCTGTA 9720

CGAGCAGATG TAATGGCCAT AAATTCTTGG CTATCCTTAC CACCCATGGC TCCACCGTCA 9780

CCAATAATAG CCTTGAAGTC TAAACCACTA CGAGTGAAAA TACGCTCATA GGCTGCTTTG 9840

TACTCATCAT AAACACTATC CAAACTATCA TAGTTAGCGT GGAAACTATA AGCATCCTTG 9900

ATGATAAACT CACGTGTACG AAGAAGTCCA TTACGCGGGC GTTTTTCATC ACGATACTTG 9960
GGCTGAATTT GATAAAGGTT GAGTGGCAAT TGCTTGTAAG ATTTAACAGA ATCACGGACA 10020

ATAGCTGTAA AGGTTTCTTC GTGAGTTGGA CCTAAGATAA AGTCTGATTT TTCACGGTTT 10080 

TTTAGTTTGT AAAGGTCTTC ACCATAGGTT TCGTAACGAC CTGATTCACG CCACAATTCT 10140 

GCACTAAGAA GGGCTGGAGC CAACATCTCA ACAGCACCAA TCTTTTCGAA TTCTTGGCGC 10200 

ATGATGTTTT TAGCTTTTTC AATCACACGG TTGGCAAGTG GTAGATAAGA ATAAACACCT 10260 

GCTGAAACTT GGCGAACATA ACCAGCACGC AACATAAGAG CATGGCTGAT AACTTGAGCA 10320 

TCGCTTGGCA TTTCGCGAAG CGTTGGGATA GGCATTTTAC TTTGTTTCAT AATATTCCTC 10380 

GATTATCTAA AAAAGAGTCG CATAATGTCA TTCCAAGTCA CAGCAATCAT CAAGACAACC 10440 

ATGATGACCA CTCCGGCCAA GGTGACATAG GTTTCAATTT CTTGTTTCAA TGGTTTGCGG 10500 

CGGATGGCTT CTAGGATATT GAGCACAATC TTACCACCAT CCAAGGCTGG AATCGGAATA 10560 

AGATTAAAAA TCCCAATATT GATGGAAATC ATTGCCAAGA AGTACAAGAT ATTTTCAATT 10620 

CCATTTTTAG CAGCATCACT ACTTGCCTTA AAGATAGCAA CAGGTCCACC CAACTTGTTC 10680 

AAATCTGGTT GGAAAATCAG ATTTTTCAGA GCTGAGAGAA TTCGGAGAGC TGAGTCAGCA 10740 

GCAGTTGTAA AACCACCTAC AAACATGGAT AGAAAATCTG ACTTAACCCC CGGTTGAACA 10800 

CCTAGAAGGT AACGACCTTG ACTATCTTTG GGTGTAACAG TGACTTGTTT GTCACTCCCC 10860 

TTTTCAGAAA TAGTCACATC CAAAGTCGGT GCCGTCTTAT CTTTGGTTTC TGTTTCCACA 10920 

GCTTGGATCA AGCTTTCCCA GTTGCTAACC TCATGTGAGC CAATCTTGGT AATTTGTGCC 10980 

ATTTCTGGTA CTCCTACCTT GGCCAAGGCA CCTTGGGGCA TGATATGGAA CTGATTGGTA 11040 

TCAACATCTC TGACACCACC CTGCATAAAG ATTAAAACCC AAAAAACAAC GACACCTAAG 11100

ATAAAATTGT TCATAGGACC TGCAAAATTG GTAATCAGTT TGCCCCAGAT AGTCGCATTT 11160 

TGATATTGAA CATCTAAAGG TGCAATCCGA ACCTCAGTAC CATCTGCTTC CACAACCGTT 11220 

GCATCGTGAT CCACTGCAAA TGTTTTTTCT TCTTCCAGAA CCAATCCTTT GATAAAGAGC 11280 

TTGTCTTCAA AATCAAACTG GGTCACCTGC ATAGGGAGGG CTGTTTGATC CAATTTTTTA 11340 

CCTGAGAGAT TGATGCGTTT AACCTTACCA TCATCAGCAA GTGTCAAACT AACAGGCGTT 11400 

CCTGTCTTGA TTTCAGTTGT ATCATCACCC CAACCGGCCA TGCGGACATA GCCACCCAGA 11460 

GGCAAGATTC GAATGGTATA GGCCGTTCCA TCCTTGCCAA TGTGAGCAAA AATTTTAGGT 11520 

CCCATACCGA TGGCAAATTC ACGTACTAAA ATCCCTGATT TCTTGGCAAA GTAGAAGTGA 11580 

CCGAACTCGT GCACCACTAC AATAATCCCG AAAACCAGAA TAAAGGTTAA AATTCCGAGC 11640 

ATAGCGTTTC CTCCGTCTTT TGATTAAAAG AGTCCAAATA AGTGCATGAT TGGAAATACA 11700 

AGCAACATAC TATCGAAACG ATCCAAAACA CCACCATGTC CAGGGATAAA TTTCCCAGAA 11760 
TCCTTAACAC CAAAATGACG TTTGATCGAA CTTTCTAGTA AATCACCAAA TTGTCCAGCA 11820

ATGCTAAAGA AAATAGCAAA GACTGACATC TTGTAAATTC CATATGGAAG AGCAACTGTA 11880

CTGTCAACTA TCATAAGGAT AATGGTTACT AAAATTGCTC CTAAAATACC ACCCAAGGCA 11940

CCCTCAAGGG TTTTATTAGG CGATACCCTT GGTGCTAAGT TTCGTTTCCC ATAGTTCATC 12000

CCAACAAGAT AGGCACCACT GTCTGTCGCC CAGACGATAC ACAAGGCTAA GAGAGCCTTG 12060

TCCAAACCTG CAACACGAGC ATCTAGTAAA GCATTAAATC CAAAGCCCAC GTAGAAGCTC 12120

ATAGCAAGAG GGAAAACCGC ATCCTCAATC GTATAAGACT TGCTAAAAAC GGTCGTTCCT 12180

AACATGATTG AAATCAAAAC ACTATAGGCA ACCACATTCC CATCAACTGG CAAAAAAGTG 12240

AGGTAATTCT CCAAGGGAAT GGTCAATGCA AAGGTTGCAA AGAGGGTCAA GAGGCCCTCC 12300

ATCGTCATGG TCTCTAGACC TCTCATCTTC AAAAGTTCAT GCATGGCTAG CATGGCTATG 12360

ATTCCGATTG CTATCTGAAG CAAGAGGCCC CCAATCATTA AAATTGGTAG GAAAATAGCC 12420

AGGGCAATCC CTGCAAACAA GGTTCTTTTC TGTAAATCCT GGGTCATATT TCCTCCTAAA 12480

CTCCTCCAAA TCGGCGATGA CGACGATTAT AGGCAAGAAT AGCTTCCTGC AAGGCCGCTT 12540

CGTCAAAATC AGGCCATAAG GTGTCCGTAA AATAAAGCTC ACTATAGGCT CCCTGCCATG 12600

GAAGGAAATT GCTCAAACGT AATTCTCCAC TAGTACGGAT AATCAAGTCT GGGTCTCGTA 12660

AGTCCTTAGG CAAATGCTGA GTAAAGAGAT AGTTACCAAT CAATTCCTCT GTGATGTCAC 12720

CTGGGTTGAT TTTGGCATCT AAAACATCCT GGGAAATCAA GTTAAGCGCC TGTGTAATCT 12780

CAGCACGTCC ACCATAGTTA AGAGCAAAAT TAAGAATCAA TCCTGTGTTG TTCTTAGTCA 12840

ATTCCTCAGC CTTGGTTAAA GCTTCAAAGG TTTGCTTAGG CAGGCGGTCT GTCTCGCCAA 12900

TCATTTGAAT CTTAACATTA TTCGCATGTA GTTCCGGGAC ATAATTATCA TAAAACTCTA 12960

CTGGCAAGTT CATGATAAAC TTGACTTCCT GATCTGGACG GGTCCAGTTT TCCGTAGAAA 13020

AAGCATAGAC CGTAATAACC TTGACGCCCA GTTTGTTGGC TGCCTTGGTC ACGGTTTGCA 13080

ATGCTTCCAT GCCCGCCTTA TGTCCAAAAA CTOGCGGTTG CATAGGTTTT TTAGCCCAAC 13140

GGCCATTGCC ATCCATGATG ATGCCGATAT GAGCAGGAAC CTGTGTCGGA ACCTCTACTT 13200

CCACAGCCTT ATCTTTCTTA AAAAATCCAA ACATGATCTT ATTCCTATTC AAAAATCTAT 13260

CGTTTCATTA TACCATATTT CCCCATTTTC TTCTATCACT AAGCTATTTA TTCTCAGGCA 13320

CCAAGCCCAT TTTTCAAAAA AATAAGCCGC CTGATTGGGC GACTTTATTT TTATAGGGAG 13380

ATTATTATGA AAAAGTTTTA GGAGTTTAAG TTAAGGTCTT GTTAACTTAT GAACTTAGTG 13440

TACACTCCCT AGCTTAAAGT TTCCTTAAGT ATTTTTAAAA ATCAAATTTT TCCATTTCTC 13500
CTGCCAATTT TTCTTGGATA AACGTGTTTG ATAGAGTTCC ATTCGGTCTT CATTTTCTAA 13560

GAAATGAGGA GTTGGACGAA CTTGAAAATT CAAAATATCC TCCAAACCAT AAGGTACATA 13620 

GAGTTCAAAA TCTAATTCTT CATTCAAGCG CAGTCCAACT GCCGTACACC GTTCTGGATA 13680

CTTACTCATA GCATCACGAG AACTGGTATA GGAAGCAGTG TGAGGACTGT GCTGATGCAT 13740 

ATAGACCTGA TTTTTCAATT CCCACTGGTA CTGAGGAAAA TCCTCTCTCA GCTTTTTCTC 13800 

CAGTAATAAG GTTTCCTCAT AAGAAAAATC TGGATCAAAG AAAATCACAT CTATATCTGT 13860 

TTCATGATCA AAAGGGGATT TGTCTGACAA AAGATTCCAG ATGAAATTTC TGACAGAACC 13920 

TGCTGCCAAC CACGAGTCTT TCAAACCAAG GTCTCGGATG ATCGTCAGAA TGGCCATCAT 13980 

ATCTGGACTT TCTCTAAAAG CCTCTAAGAT TTCTTGCTTA TTTTTCACTG TATTCATAAC 14040 

CTAAGTGCTC ATATGCCTTA GCAGTCGCCA CCCGTCCAGA CCGTGTCCGC ATGATAAAAC 14100 

CTTTTTGAAT CAAGTAAGGC TCATACATGT CTTCAACTGT CTCACGCTCT TCGGCGATAT 14160 

TCACAGAAAG AGTTCCTAGA CCAACAGGTC CTCCACTGTA CATCTCAATC ATGGTGCGAA 14220 

GGATTTTTTG ATCCACATAG TCCAAACCTT CATGGTCAAC ATCCAGCATA GTCAAAGCCT 14280 

TATCGGTAAT AACATCATCG ATAACCCCAT TCCCCATTAT CTGGGCAAAA TCGCGCACGC 14340 

GCTTGAGGAG ACGATTGGCA ATACGAGGGG TTCCACGACT ACGTAGGGCC AACTCAGATG 14400 

CTGCCTCATG GGTGATTTCC ATCTCAAAAA TATCTGCCGT CCGCTCGACA ATTTCTGTCA 14460 

AGTCAGCATG AGCATAATAC TCCATATGAC CTGTAATCCC AAAACGTGCC CGTAGTGGAT 14520 

TTGAGAGCAT ACCAGCCCGA GTCGTCGCAC CAATCAAGGT AAAAGGAGGC AACTCCAAAT 14580 

GAACACTGCG ACTGCCTTCA CCAGCCCCAA TCATAATATC GATGTAGAAG TCCTCCATGG 14640 

CACTATAAAG CACTTCTTCC ACTGACATGG GTAAGCGATG AATCTCGTCA ATAAAGAGGA 14700 

CATCTCCAGG CTCTAAATCA TTCAAAATCG CTACCAAATC ACCCGCTTTT TCGATAACAG 14760 

GACCAGACGT TTGCTTGAGA TTGACTCCCA GTTCATTGGC AATGACAAAA GCCATGGTTG 14820 

TTTTCCCAAG CCCTGGAGGG CCAAATAAGA GCACATGATC CAGCGCTTCA TCCCGCATTT 14880 

TAGCGGCTTC GATAAAGATC TGAAGTTGAT CCTTAACCTT ATCCTGACCA ATATATTCAC 14940 

GTAAATACTG AGGACGGAGC GTGCGTTCTA CTAACTCCTC ATCACCCATC ATCTCATTAT 15000 

CTAAAATTCT ACTCATGGCT CTATTATATC AAAAAAAACA AGCCACAAAC AAAAAAGCCA 15060 

CCTGATTGGG TGACTCCTAA GTTTAGCACT TATGTGGTAT AATATTATAC GGCACTTCTA 15120 

CACCGCCTAC GAAAGGAGGT GAGATAGCCC ATGATGGAAT TAGTACTCAA AACTATTATC 15180 

GGACCAATTG TGGTCGGTGT CGTTCTTCGT ATAGTCGATA AATGGCTAAA CAAGGACAAA 15240 

TAGTGTCAAA AAAGACCTCA AGCTTATTTG GTCGTGAGCT TGGGGTCTTT TCTAGCCTAT 15300 
GATATAGAAC TAGTACTCAA TTCCTTTTTA TTATCCCATA GTTCACGAAT TTTGTCAAAA 15360

CTTTACATTT TCTTCAACCG CTGTACGACA AGACGGTTAA GATTAAGAGA ACGTTAGGGA 15420 

TTCTATCAAT TTCATAGAAA TTTTGATTTC GTAAACGAAG AGACAATCTT ACATGTCACT 15480 

TCTCATTTAA TACGCCACTA CTAGACAAGC AAAATCATTA TTACAGTAGT TCCAGTCCTT 15540 

CAATTAACAG TCACTTACAA TCAAATTGAG TTTGAACTAG CTGAAGCGAC CACAGACCTA 15600 

TTTCTTAGTC ATATTCGCTA AAAAAATCCC CGCCAAAATC TCAAAAAGTC CCCGCCAATT 15660 

CCCCGACCAA AATCCGAAAA ATACCGAAAA ATATCGAAAA ATTATTTTTA GAATAGTCCC 15720 

AAAAATCCTG AAATAGAGCT AAAAAACTCC ACCTGATTCG GTGGAGTTAA GGGAGATTAT 15780 

TATGAAAAAG AAAAGTTTAG GATTTTATTA AATAAAGTTA GGAGGTCTTT ATTTAATAAC 15840 

TACATGATAC AAGACGAAAC TTAAAACTAG CTTAACTTTT CTAAAATTTT ACTATTTTGC 15900 

AAAAAATTTC TATCACCAGC ACCTCACCAA TCGAGTAGGG GATAATCTCT AGCCCCTCTC 15960 

ACACCACCGT ACGTGCCGTT TGGCATACGG CGGTTCAACT AACTTTTAAC GCATGTCGTT 16020 

CAAGGTAATA ATCCAAACAC GAAACCAGTC CACGTTTTTC CAGGACTGGT TTTGATATAG 16080 

CACGTTTAAG TACCGACTTC TGAGCTACTA ATTGATAATG GTCGCGCCAG CCAGATACCT 16140 

TATCTGCTAT CCATTTAGGA ACTCCTAACT TAAGCAATCC CCATAATCGT CTCGATTTCT 16200 

TCTTCCATTG CTTCCAGATA ATCACTCGTA GGCGAGTACG CAAGCGCTCA TCTATGCTGG 16260 

CGACTATACT TTTCATATTT CCCAATGAGC AATAGTTTAT CCATCCTCGA ATAGACAAAT 16320 

TCAGTTGCTC AATACGTCTT GTTAGGTCTA TACTCCATTT CCTCTGTGTT AGTTTCTTCA 16380

ATTTAAACTT AAATCTCCGA ACACTATCTT GATGTGGACG GCTTTTCCAA CCATCTGATA 16440 

ATTTCCAGAA CCCAAAACCT AGATATTTCA ACTCTCTTGG TCATGTTTAC TTTCAAACCT 16500 

AGCCGTTTCT CAATAAACGA CTGACTGAAT ACATC 16535 
(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8136 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

CCAGAGCGTT GCGTCCGAAA GTCTATCCAG ACACGGCTCT TTAAAAACAA AAGGAGAAAT 60 

GATGCATACT TATTTGCAAA AGAAAATTGA AAATATCAAA ACAACCCTAG GTGAAATGTC 120 
AGGTGGTTAC CGTCGTATGG TTGCGGCTAT GGCTGATTTA GGATTTTCAG GAACTATGAA 180

GGCTATCTGG GATGACCTCT TTGCCCATCG TAGTTTTGCC CAGTGGATTT ATTTGCTGGT 240

TTTAGGAAGT TTTCCTCTCT GGCTGGAGTT GGTTTACGAA CATCGTATTG TTGACTGGAT 300

TGGGATGATT TGTAGCTTGA CAGGGATTAT CTGTGTAATC TTTGTATCGG AAGGTCGAGC 360

AAGTAATTAT CTTTTTGGCT TGATTAACTC TGTTATTTAC CTTATTTTGG CCCTACAGAA 420

AGGCTTTTAT GGTGAGGTGC TGACGACACT TTACTTCACA GTCATGCAGC CAATTGGACT 480

TCTAGTTTGG ATTTATCAGG CACAGTTTAA GAAGGAAAAG CAGGAGTTTG TCGCGCGTAA 540

ACTGGACGGC AAGGGCTGGA CAAAGTATCT TTCCATTAGT GTGCTTTGGT GGTTGGCCTT 600

TGGCTTCATT TATCAGTCTA TTGGTGCCAA TCGTCCCTAT CGTGATTCAA TCACAGATGC 660

AACCAATGGG GTAGGGCAAA TCCTCATGAC AGCTGTTTAC CGTGAACAGT GGATATTCTG 720

GGCGGCTACC AATGTCTTTT CAATCTATCT CTGGTGGGGA GAAAGCCTGC AAATTCAAGG 780

GAAATATCTA ATTTATCTCA TTAACAGTCT AGTTGGTTGG TATCAATGGA GCAAGGCAGC 840

TAAGCAGAAT ACTGATTTAC TTAACTAGGA AAAGATGTTT GAAAGTGCTG TTTTGAGATT 900

TCGATTAAAA CAGATATAGT TGATAATCAA GGATTTATAG TATGAAAAAG AGGATCGGCG 960

GGTCCTCTTT TGTTGTTGAA AAGATAAAAA ACTCAGTAAC CTAGAAATAA GACAACTGAA 1020

GCTTTACTCT ATATTCAATT TTTAGGAATG AGAAGGTCTA GATAAAATTG GACAACTTCC 1080

TGGTCTGTGA AATCTTGACC TTTTTTGAGC CACCAGGTCA ATGTCTCGAT AAAGTTGGAC 1140

ATGACCAAGT GTTGGAGGTA AGAAGTAGGC AGATTAGGGT GGGCTTCTTT TAAATTATCA 1200

GCTAGCACGG AATAGACATG GTGTTCTAGC TCTTTATGGA GTTGACGGAG GAAGTAGTCA 1260

TTTTTGGAAA ATAGCAGACT GGTGATATGG TCTTGGTTTT TATGAAAATG GAGAAAGAGG 1320

TGGGCGAGGT AGTCCTCGGT TGAAATGGCT TGCTCTCTTT CAAAAAGATG ATGGAAGAGG 1380

TAGCGGCAGA GCTGGTCCAG AAGAAGCTCC TTACTCTCAT AGTGACAGTA AAAGGTGGAT 1440

CGTCCCACAT CTGCGAGATC AATGATATCC TGAACAGTAG TGGCCTCGTA GCCCTTAGCA 1500

TTCAAAAGTT GTATAAAAGC TTGATAGATG GCTTTTTTGG TTTTGCTGAT ACGGCGGTCA 1560

ATGTTAGTCA TATGGACACT TAAGGCAAAT TGTTCAGAAC TGAATAAAGC TGACGTTTTG 1620

CTTCTATCCT TTCTTTGAGT TTTAGTGGAT AATGATAATG AACAAGGTGT TCATAAATCT 1680

ATTATAACAA AGGAATGAGA AATATGAAGG CAAAATATGC TGTTTGGGTG GCTTTTTTCT 1740

TAAATTTGAC TTATGCCATT GTTGAGTTTA TTGCAGGTGG AGTATTTGGT TCTAGCGCTG 1800

TTCTTGCTGA CTCTGTGCAT GACTTGGGAG ATGCGATTGC AATTGGAATA TCAGCTTTTC 1860

TAGAAACAAT CTCCAATCGT GAAGAAGACA ATCAGTACAC CTTGGGCTAT AAGCGGTTTA 1920
GCCTGCTAGG AGCCTTGGTA ACAGCTGTGA TTCTCGTAAC GGGCTCTGTT CTAGTCATTT 1980

TGGAAAATGT CACGAAGATT TTGCATCCGC AACCAGTCAA TGATGAGGGG ATTCTCTGGT 2040

TAGGAATTAT TGCGATTACT ATCAATCTGT TAGCGAGTCT GGTGGTTGGT AAGGGAAAGA 2100

CAAAGAATGA GTCTATTCTG AGTCTGCATT TTCTGGAAGA TACGCTAGGG TGGGTAGCTG 2160

TTATCCTGAT GGCGATTGTT CTTCGATTTA CGGACTGGTA TATCCTAGAT CCTCTTTTGT 2220

CCCTTGTCAT TTCTTTCTTT ATTCTTTCAA AAGCCCTTCC ACGTTTTTGG TCTACACTCA 2280

AGATTTTCTT GGATGCTGTG CCAGAAGGTC TTGATATCAA GCAAGTAAAG AGTGGCCTGG 2340

AGCGATTGGA CAATGTGGCC AGCCTTAATC AGCTTAATCT CTGGACTATG GATGCTTTGG 2400

AAAAAAATGC CATTGTCCAT GTTTGTCTAA AAGAAATGGA ACATATGGAA ACTTGTAAAG 2460

AGTCTATTCG AATTTTCCTA AAAGATTGTG GTTTTCAAAA TATTACCATT GAAATTGATG 2520

CTGACCTAGA AACTCACCAA ACCCATAAGC GAAAGGTGTG TGACTTGGAA CGGAGTTATG 2580

AGCATCAACA TTAGAAAAAA GTGAAAAATA CTTGGGTACT ATCTTATTTG GAATAGAGTA 2640

ATTTCTTTAT TATTTAAATA TTTCAAAAAT TGGTAAGAGA AGAGCATTGT ATAAACTCCA 2700

GATATATGAT TGTTAATGAT AAAAATTTTT CGATTAGATA CAAAATGCTT GACTTGGAGT 2760

CAACTCAAAG TTATATAATA AGATAAGTGA GTTAGAATAG CGTGAATTCA GTGAATGAAA 2820

TGAGAGGAGG TTAGCGTGTG AATATTAAAT CTGCCAGTGA TTTGTTGGGA ATTTCAGCGG 2880

ATACGATTCG GTATTATGAA CGGGTTGGTC TTGTGCCACC GATTACTCGT ACTGCTACTG 2940

GGATTCGTGA TTTTCAAGAT CAGGATATCG AAGCGCTGGA ATTTATTAAG TGTTTTCGTT 3000

CGGCGGGTGT CTCTGTAGAT AGTTTAGTTG ACTATATGTC GCTCTACCAA AAGGGAGATG 3060

AAACGAGAGA GGAGAGGCTT GGTATTTTAG AAGAGGAAAA GCAAAAATTA GAGGAGCGCT 3120

TGTCTCAGCT ACAGACAGCT TTAAATCGTT TAAATCTCAA AATTAAACTT TATAAGGAAG 3180

GAAAATTTTA AATGAAATCA GCAGTATATA CAAAGGCAGG TCAGGTTGGA CTTGCTAGCA 3240

TTGAACGTCC GCAAATAATA GAAGCGGATG ATGTGATTAT TCGTGTGGTT CGTGCGTGCG 3300

TTTGTGGTTC AGATTTATGG AGGTACCGTA ATCCAGAAAC GAAAGCTGGA CACAAAAATA 3360

GTGGACACGA AGCGATTGGG ATTGTTGAAG AAGCTGGGGA AGCCATTACG ACGGTGAAAG 3420

CAGGTGATTT TGTGATTGTC CCTTTTACAC ATGGATGTGG TGAGTGTGAT GCCTGTCTTG 3480

CTGGATTTGA CGGTTCTTGC GACAATCATA TTGGCAATAA TTTGGGGGGT GATTTTCAGG 3540

CAGAATATAT TCGCTTCCAC TATGCAAACT GGGCGCTGGT TAAAATCGCT GGTCAACCTT 3600

CTGACTATAC AGAAGGGATG CTCAAGTCCC TTTTGACTCT TGCAGATGTC ATGCCGAGAG 3660
GCTATCATGC GGCGCGTGTT GCAAATGTTC AAAAAGGGGA CAAGGTTGTT GTTATCGGTG 3720

ATGGGGCTGT TGGTCAATGT GCTGTCATCG CGGCTAAGAT GCGTGGAGCA TCACAAATTA 3780 

TCCTTATGAG CCGTCATGAA GACCGTCAAA AGATGGCTAT GGAGTCAGGT GCGACAgcTG 3840 

TTGTTGCAGA ACGTGGTCAA GAAGGAATTA CCAAGGTGCG TGAAATCCTC GGTGGAGGAG 3900 

CAGATGCAGC ACTTGAATGT GTTGGTACGG AGGCTGCTAT AGAACAGGCG CTAGGTGTTC 3960 

TTCATAATGG AGGGCGTATG GGCTTTGTAG GAGTCCCACA CTATAATAAT CGTGCTCTTG 4020 

GTTCGACATT TATGCAAAAT ATCTCTGTAG CAGGTGGGGC AGCTTCTGCT ACAACATACG 4080 

ATAAGCAATT TTTACTAAAA GCCGTCCTTG ATGGTGATAT CAATCCAGGT CGCGTCTTTA 4140 

CTTCAAGTTA TAAACTGGAA GATATCGACC AAGCCTATAA AGATATGGAT GAACGTAAGA 4200 

CAATTAAGTC TATGATTGTA ATCGAATAAA AAACGAATAG GAGTTTTAGA ACTCTATTCG 4260 

TTTTTTATGT TATCCTATTC TTGATTTAGG GTACTTTCTC TTAATGTCAG TCTGGTTCCC 4320 

AGCATGGTCA GGCTAGGGAT TTTCCGACCG TGGAGGACTT XCTTGTTAAG AATATCCATA 4380 

CCTGCTCGGC CCATTTCTTC AGTATAAACT GTAATACTAG AGAGGGGAGG ATAGACCTGT 4440 

TTGGTCAGAC TAGTGTCGTT AAAGGAAATG AGGCTGACGC GATCTGGCAG GCTGATTCCA 4500 

GCTTCTTGGA GGGCACGGAG GGCACCGATA GCTAAACTAT CGCTGGCTGC GAAAAATGCT 4560 

GGCGGAAGTT GGTCTCCCAA GCTCTGAATG GCCTCCTTCA TTAAGTCATA GCCAGACTGG 4620

GCAGTAAATC TTCCTTGAAA GACCAGTTCA TCATGATAGA TTCCCCTCGC TTGACTATAG 4680 

TTTTTGAAGT TTTCTAGACG CTTGTCCTGA ATGATTTCTT CTTGGTCTGT TGTTTCTTCA 4740 

AGGCCTGTTA GAATCCCGAT ACGGTCCATT CCTTGACTGA GGAAATAATC GACAACCTGT 4800 

TTCATAGCAG TGTAAAAATC CGTGATAATA CAGGTATGTC CCAGGGAAAG TGTATCGCTG 4860 

TCTAGAAATA CAAGAGGCTT TTGGTATTCT TCAAAGGCAG AAATCTGAGC TCGACTAAAC 4920 

TTTCCGATGC AGAGAATCCC AATCACTTCC TCGCTTAGGG TAAAAGGGTG GTCATTAAAA 4980 

TAGCGCAAGA TATCATAGTC CAACTCTTGG GCTCTTTTTT CTATTCCTAG GCGAATCTGG 5040 

TAGTAGTAGA GGTCGTCCAG CTCCCCTTGT TCGCTGACCC ATTGGATAAT GGCAATCTTT 5100 

TGCTTGGGTT TGTGGGACTC GCCTGTCTTG AGGTGCTTGG TGTAGCCCAG CTCTTCAGCA 5160 

ACGGTTAAAA TACGGTGTCT GGTTTCTTCT GTAACAGATA GGCTCTGGTC GCGGTTGAGG 5220 

ACGCGGGATA CGGTCGCGAT AGAGACAGAG GCTAGCTGTG CAATGTCTTT TAAGGTAGCC 5280 

ATAAATCCTC CTTGATTAGG TTAGTATATC ATGTTTTTCT TCTTTTTACT GATATTTTAC 5340 

TAAAATTTTA GTAAAAAGGA TTGACCTTGG AAAATTCCTT GGATATAATA GAAAGAAAAC 5400 

GATTACACGT TAAGATGGCT TAACGGACAG TCAAAGGAGA ATTCATATGG CACAACATCT 5460 
TACTACTGAA GCCCTTCGCA AAGACTTTCT TGCTGTTTTT GGTCAAGAAG CAGATCAAAC 5520

CTTCTTTTCA CCAGGCCGCA TTAATTTGAT TGGTGAACAC ACAGACTACA ACGGTGGGCA 5580

CGTTTTTCCT GCTGCTATTT CCTTGGGAAC TTACGGTGCA GCTCGTAAGC GTGACGACCA 5640

AGTCTTGCGT TTCTACTCAG CTAACTTTGA GGACAAGGGC ATTATCGAAG TGCCTCTCGC 5700

TGACCTCAAG TTTGAAAAAG AGCACAACTG GACCAATTAT CCAAAAGGTG TCCTTCATTT 5760

CTTGCAAGAA GCTGGGCACG TGATTGACAA AGGTTTTGAT TTTTATGTTT ATGGAAATAT 5820

TCCAAATGGT GCTGGCTTGT CTTCTTCTGC ATCCTTGGAA CTCTTGACAG GAGTCGTGGG 5880

TGAGCATCTC TTTGATTTAA AATTAGAGCG TCTCGATTTG GTTAAAATCG GCAAACAAAC 5940

AGAAAACAAC TTTATCGGAG TAAACTCTGG CATTATGGAC CAGTTTGCTA TTGGTATGGG 6000

GGCAGACCAA CGTGCTATTT ACCTAGATAC TAATACTTTA GAATACGACT TGGTGCCACT 6060

TGATTTGAAG GACAATGTCG TTGTTATCAT GAACACCAAC AAACGCCGTG AATTGGCGGA 6120

CTCTAAATAC AATGAACGTC GTGCTGAGTG TGAAAAAGCA GTGGAAGAAT TGCAAGTTTC 6180

CTTGGATATT CAGACTCTGG GTGAATTGGA CGAGTGGGCC GTTGACCAAT ATAGCTATCT 6240

GATTAAAGAT GAAAATCGTT TGAAACGTGC TCGCCATGCT GTGCTTGAAA ACCAACGTAC 6300

CCTCAAAGCT CAAGTAGCAC TCCAAGCAGG AGATTTGGAA ACATTTGGAC GCTTGATGAA 6360

TGCGTGACAC GTTTCTCTGG AGCATGATTA TGAAGTAACT GGTTTGGAAT TGGATACCCT 6420

TGTTCACACA GCTTGGGCAC AAGAAGGAGT TCTCGGTGCT CGTATGACAG GGGCTGGTTT 6480

TGGTGGCTGT GCcATTGCCT TGGTTCAAAA AGATACTGTT GAGGCCTTTA AGGAAGCTGT 6540

AGGCAAACAC TACGAGGAAG TAGTTGGATA CGCTCCAAGC TTCTATATCG CTGAAGTTGC 6600

AGGTGGCACT CGCGTCCTTG ACTAGTCAAA AGGAGGCTCT ATAGTGACCT TAGTAAATAA 6660

ATTTGTAACA CATGTCATTT CTGAAAGCTC ATTTGAGGAA ATGGATCGAA TCTATCTGAC 6720

CAATCGTGTT TTGGCACGAG TGGGAGAAGG TGTTTTGGAA GTTGAGACCA ATCTGGATAA 6780

ATTGATTGAC CTCAAGGACC AGCTGGTTGA AGAAGCCGTT CGATTAGAGA CGATTGAGGA 6840

TAGTCAGACT GCGCGTGAAA TCCTTGGTGC TGAACTGATG GATTTGGTGA CTCCTTGTCC 6900

AAGTCAGGTC AATCGTGATT TTTGGGCAAC CTACGCCCAC TCTCCAGAAC AAGCGATAGA 6960

GGATTTTTAC CAACTCAGTC AGAAAAATGA CTACATCAAA CTCAAGGCCA TTGCTAGAAA 7020

TATCGCTTAT CGTGTTCCAT CTGACTACGG AGAACTTGAA ATTACCATCA ATCTCTCTAA 7080

GCCTGAAAAA GATCCCAAAG AGATTGTGGC AGCCAAGTTG GTGCAAGCTA GTAATTATCC 7140

TCAGTGTCAG CTTTGTCTAG AGAATGAGGG CTACCATGGT CGAGTTAACC ACCCAGCTCG 7200
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TAGCAATCAC CGTATTATCC GTTTTGAAAT GGTTGGTCAG GAATGGGGTT TCCAGTATTC 7260 

GCCCTATGCT TACTTTAATG AGCATTGTAT CTTTTTAGAT GGCCAGCATC GTCCCATGGC 7320 

CATTAGTCGT CAGAGTTTTG AACGTCTGTT GGCTATCGTA GACCAGTTTC CAGGATATTT 7380 

TGCTGGATCT AATGCCGACC TGCCGATTGT GGGGGGCTCT ATTCTAACTC ATGATCATTA 7440 

TCAGGGAGGC CGTCACGTAT TTCCTATGGA ATTGGCTCCC TTGCAAAAGG CCTTCCGATT 7500

TGCTGGTTTT GAGCAGGTCA AGGCTGGAAT TGTCAAGTGG CCCATGTCTG TCCTACGTTT 7560 

GACTTCGGAT TCCAAAGAGG ATTTGATCAA TTTGGCTGAT AAGATTTTGC AGGAATGGCG 7620 

CCAGTATTCA GATCCTGCAG TGCAGATTTT GGCAGAGACA GACAGGACAC CGCATCACAC 7680 

TATCACACCC ATTGCCCGCA AACGCGATGG ACAGTTTGAG TTGGACTTGG TCTTGCGAGA 7740 

CAATCAGACT TCAGCAGAGT ATCCTGATGG TATCTATCAT CCCCACAAGG ATGTCCAACA 7800 

TATCAAGAAG GAAAATATCG GCTTGATTGA GGTCATGGGC TTGGCAATCT TGCCACCACG 7860 

TCTGAAAGAA GAAGTGGAGC AAGTCGCTAG CTATCTTGTA GGAGAAGCTG TTACAGTTGC 7920 

CGATTATCAT CAGGAGTGGG CAGACCAACT CAAATCCCAA CATCCAGACT AACGGATAAA 7980 

GAAAAAGCCC TTGCAATCGT CAAGGACTCT GTGGGTGCTA TCTTTGCGCG TGTACTTGAG 8040 

GATGCAGGAG TCTACAAGCA GACAGAACAA GGGCAGACAG CCTTTATGCG CTTTGTGGAA 8100 

CAGGTCGGAA TTTTACTAGA CTAGGAGCTT TCTCGG 8136 
(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10011 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

CCCATAGTGA AGAGTGGCCA TAAGAAGGTC TTCTAGGCTT AATTTAGGTT TTCGTCCACC 60 

TTTTGCGTGT TTAAGTTGAT AAGCTGTTTT TAACACAGCT GAACATCTCT TCAAAAGTCG 120 

TGCGCTGAAC ACCAACAAGA CATTTAAATC GTGTATCAGT TAGTTGTTTA CTTGCTTCAT 180 

CATTCATAGA ACTACTATAC CATGTTTTGT TTCGCAGGAA GTCTAATATT GTCAAATACT 240 

GGAACGCTCA TTGCTGGGAT ACGGAATAAG ATTGGCCCAG CTTCGATAAC TGGGATACCT 300 

GGTTCAAAAC CAAGGTCTGT TGCAGCGATT GGTGTAAAGA TATCGTAACC TTTCATAAGG 360 

TCTTCGTTTA CATCTTTCAC CATAACTGCA TCACAGTGAA CATCGTAACC ACGGTTTGAA 420 

AGTTCTTCTT CTAGAGCACT TTTAATTTGG TGACTTGAGT TAACACCTGC ACCGCAGGCA 480 
GCAAGAATTT TAATCATTTG GATTTCCTCC GATTTTATTT TTTAATAGAC AAGATTAAGC 540

GGTTGCTTCA GCAATGTAAG CATAAAGGGC TTCTGGTTCA GAAATTTTTG ATAGGTCTTC 600 

AAGATGACCA TTTCCTGTGA AGAAGTCCAT TAACTGAGCA AGAATGTTCG TTTGACTTGA 660 

ACTTGAATTA TTGATGATAA AGAAGAGCAA GGATACTTCT AGTTCCTTAC CTGGCGCAAT 720 

CATATTATGG AAAGTCACCG GTTTCTCTAA TCGAACAACC ACCACTTTCT CAGCTAGATT 780 

ATGAACAATA TCTGTGTGAG GAATCATTAC ATTTGCAAGT CCTTTCCTAG AAATTCCATA 840 

TATAAACCAG TTGGAAATGA CTTTTCACGC GTGATCAAGG CTTCACGATA AGTTGGAGTG 900 

ACAATTTCTC GTTCTTCCAA CAAGCTTGCT ACCTGATCAA AAAGTTATTC TTGATTATCC 960 

GCTTCTAAGC AAAACACAAG GTTTTTGTCA AAGAAATAAT CTAATACCAT AAGGTTTTCC 1020 

CTTCTTTCCA TTAACTTTAT GCTATAAGTA TAACACTATA TGAAATCGTT GTTAATTACT 1080 

TTCTATTCTT TTTTGTCTCT TTTTTTATAT TTTTGTTTTG TTTATAGTTT GTTATATAAA 1140 

AATAAACACA CAAACAAATA CTCCAAGCAT TTTTCTGTTC TAATACTCAA TGAAAATCAA 1200 

AGAGCAAACT AGGAAGCTAG CCGCAGTTGT TCAAAACACA GTTTTGAGGT TGTAGATGAA 1260 

ACTGACGAAG TCACTCAAAA CATGGTTTTG AGGTTGTAGA TGAAACTGAC GAAGCAACAg 1320 

CCATACATAC GGTAAGGCGA CGCTGACGTG GTTTGAAGAG ATTTTCGAAG AGTATAAAAA 1380 

CTAAAAAAGC AGACCATCTA AGCCTGCTTT ACTATTGATT GTTATATAAA TTTCCTGTGA 1440 

ACAAGGAAAG GCATTTCTGA TAACTTATTC TTCATCCATA CTCAAGACGC TGAGGAAGGC 1500 

TTCTTGCGGA ACTTCAACTG ATCCGATGGA TTTCATGCGT TTCTTACCAG CTTTTTGTTT 1560 

TTCAAGGAGT TTACGCTTAC GAGAAACGTC ACCACCATAA CATTTAGCAA GTACGTTCTT 1620 

ACGAAGGGCC TTGATATCAG TACGAGCGAC AATCTTGTGT CCAATAGCCG CTTGGATTGG 1680 

AACTTCAAAT TGTTGGCGAG GGATGATTTT CTTGAGTTTA TCAACGATGA GTTTCCCACG 1740 

TTCGTAGGCA AAGTCCTTGT GAACGATAAA GCTGAGGGCA TCCACCTTAT CTCCATTGAG 1800 

AAGAATATCC ATTTTCACCA GCTTAGATGG GCGATATTCT GACAATTCGT AGTCAAAGCT 1860 

TGCATAACCA CGTGTCGAAG ACTTAAGTTT ATCAAAGAAG TCAAAGACAA TTTCAGCAAG 1920 

AGGAATTTGA TAGATAACAT TGACACGGTT ATCATCAATA TAGTCCATAG TCACAAAGTC 1980 

CCCACGCTTA CGCTGAGCTA GCTCCATTAC TGCTCCGACG AACTCCTGTG GTACCATGAT 2040 

TTGCGCCTTG ACATAAGGCT CTTCAATGGT CGCAATCTTA GTTGGGTCTG GAAACTCAGA 2100 

TGGGTTAGAC ACATCCATAG ACTCACCGTC GGTCAAATTA ACTTTGTAAA TAACAGACGG 2160 

AGCTGTCATG ATGAGGTCAA TATTGAACTC ACGCTCTAAA CGTTCCTGGA TAACATCCAT 2220 
ATGGAGAAGT CCAAGAAATC CACAACGGAA ACCAAATCCA AGTGCCTGAG ATGTTTCTGG 2280

TTCAAACTGA AGACTAGCAT CATTCAGTTG CAATTTTTCA AGcGCTTCAC GCAGGTCATT 2340 

GTACTTGTTT GATTCGATTG GGTAGAGACC CGCAAAGACC ATAGGATTCA TCTGCTTATA 2400 

ACCATGTAAT GGTTCTGCCG CAGGATTGGT TGCCAAGGTA ACGGTATCAC CCACACGAGT 2460 

ATCCTGAACC GTCTTGATAG ACGCCGCAAT GTAACCAACA TCACCAGTCG CAAGGAAATC 2520 

ACGACCAACC GCTTTTGGTG TAAAAATACC GACTTCGGCC ACATCAAAGG TCTTACTATT 2580 

GCTCATGAGC TGAATCTTAT CACCAGGTTT GACCACTCCG TCCATGACAC GCACTTGGAG 2640 

GATAACCCCA CGGTAAGCAT CGTAAACAGA GTCGAAAATC AAGGCCTTAA GTGGCGCCGT 2700 

CACATCACCC GTTGGTGCTG GTACTTTTTC TACAATTTGC TCGAGGATTT CTTCAATCCC 2760

AATACCAGCC TTGGCAGAAG CCAAAACTGC TTCACTGGCA TCCAAACCAA TCACATCTTC 2820 

AATCTCTGTA CGCACGCGCT CCGGATCTGC AGCCGGCAGG TCAATTTTAT TAATGATAGG 2880 

CATGATTTCC AAATCATTAT CCAAAGCCAG ATAAACGTTG GCAAGAGTTT GAGCCTCAAT 2940 

TCCTTGAGCC GCATCGACCA CCAAAATAGC ACCCTCACAG GCAGCTAGCG AACGTGAAAC 3000

TTCATAGGTA AAGTCAACGT GCCCTGGTGT GTCAATCAAG TGGAAAATAT AAGTTTCCCC 3060 

ATCTTTTGCA GTGTAATTCA ACTCGATGGC ATTCAACTTA ATAGTAATTC CACGTTCCCG 3120 

CTCTAGCTCC ATGCTATCCA AAAGCTGGGC CTGCATTTCA CGACTTGAAA CCGTCTCTGT 3180 

TTTTTCCAAA ATGCGGTCTG CTAGAGTTGA TTTTCCGTGG TCAATATGGG CGATAATAGA 3240 

GAAGTTACGG ATCTTCTCCT GTCGTTTTTT CAATTCTTCT AAGTTCATGA TTCTCTTCCT 3300 

TTCAGGGTAT CTATTTATTA TAAATTGTTT TTGATATTTT GACAAGACCA TACCCTGCTA 3360 

GGAGTACTAA TCTTCAGCGA CAAAGCCGTC ATTTTCGATA AAGTGGTGTT CTGTCATTCC 3420 

TTGGTCTGTA AAGACAATCC CGTGAAGGAC ACCACCATAA ACAGCTCCTC CATCGATTCC 3480 

AATCTTGCCA TCTTCTGTAG TCCAAAGCTC AGATGTACCG CGTTCTTGCT GTAACAAACC 3540 

ATAGACCGGT GTATGACCGA AGACAATGGT TTTTCCAGTA TGATTTTCAG CTCCGTGGAA 3600 

TGGTTTTCTA AGCCATACTT TTTTATAATC TGTTGTTTCA TGCCAGTCGT CCAAGGTCAA 3660 

ATCAATACCT GCGTGAACAA AGATATACTT GTCTGTCTCT ACTACAAATG GCATTTGACG 3720 

AATGAATTCG ACCAAGTCTG CCGCTTCAGC GgCAACCGGC TTGGCATCTT CTACTCCATC 3780 

AACTGGTGCA TCCAAGGGAC GACCTAGGAT AGAGTTAATG GTTGTATCTC CACCATTGCG 3840 

ACTATAATGG TCATAACTTT CTTCTGGGTC ATCTAGCCAA GTCAAAAACA TATACTCGTG 3900 

GTTTCCGGAC AAACAGATAG CCCCTTGATT GTCCACCAAG TCCTTGACCA TTTCAAGAAC 3960 

ACGGTGACTA TCCTCACCTC TGTCAATCAA ATCACCTAGA AAGAGCAACT GGGGCTGACC 4020 
ATCCCAGGTT TTGAGAAGGT CTTCCAGCAT CCCAGCTTTT CCGTGAACAT CTCCAATTAC 4080

ATAATAATCT GTCATCTTAT TTCTCCCTGT TTCTCAACAA TTCTCTTGCT TGCGTCAGGG 4140 

CTGCTTCTGT CACATCATCA CCTGCCAACA TCTTGGCAAC TTCCTCCACT CGCTCTTCGA 4200 

CCGTCAAGAG ACGAACAGTC GAAACCGTTG AATGGTCATT ACTAATCTTC TCAATAAAGA 4260 

ATTGATAATC TGCAATCGCA ATTACTTGTG GCAAATGGGA GATAGCCAAA ACCTGACCAT 4320 

GCTGACCAAT TTTATGAATT TTCTGAGCAA TAGCTTGAGC AACACGACCT GAAACTCCCG 4380 

TATCCACCTC ATCAAAGACA ATGCTAGTCT TGCCTTCTTT ACGTGAAAAG GCAGACTTAA 4440 

TGGCTAACAT GAGACGAGAT AATTCCCCTC CAGAAGCAAC CTTAACCAAG GGTTTAAAGT 4500 

CTTCTCCAGG GTTGGTTGAA ATATAAAACT CAACCATTTT ATTTCCCTCA CGACTGAATT 4560 

TTCCCTTACT AAAACGAACC TGAAACTGGG CTTTTTCCAT ATAAAGATCT TGCAGTTCTT 4620 

GTTTAATCTC AGCTTCGAGT TGCTGAGCCA AATTATGACG AGCAGAAGCA AGTTGACCTG 4680 

CCAAATTGAC AAGATTGACT TCCAACTTCT TAAGCTCTGC TTCCATGTCC TCAGACGAAA 4740 

GATTATTGCC TGTCAAGAGA TTGTATTCTT CCGTAATCTT GGCAAAATAA AGCAAAACAT 4800 

CATCAACAGT CCCACCATAC TTACGAGTAA TAGTATGAAG GAGGTCCAAA CGATTCTCAA 4860 

CCTGCATCAG GCGATTGCCA TCAAAATCAA GGTCCTCAAT GATAGCTTCC AAACGTTTGC 4920 

TAATGTCTTC TAAAACATAG TAGGTCTCAG ACAGATAGCT TGAAATTTCA CGGTATTCAG 4980 

GATCATACTC TTCGACACTT TCCATGTCAT TCATAGCTGA ACGAACATTG GCCAGACTTG 5040 

AAAAATCTTC ATTGTCCAAC ATACTGTAGG CATTGGTCAG TGTATCCGCA ATATTTTTGT 5100 

GGTTGAGGAG TTTATCTCGC TCTTGATTGA GAGCCAAGTC TTCTCCAGCC TGCAAGTTTG 5160 

CTGCCTCAAT CTCTGCCATT TGAAATTCCA ACATTTCGAT ACGTGCCTTG TGTTCCTGTT 5220 

GGTTTTTCTT GACTTCCAGA ACCTGCTTGC GCATTTTCCG ATAGGCATCA AAACTCGTTT 5280 

GATAGGTTTC TTTCAAGTCC CAAAAAGCGG CATCACCAAA TTCATCCAAC ATCTGGATAT 5340 

GCAGTTGGGG ACGCATTAAC TCCTCATGGT CATGCTGACC ATGAATATCT ACAAGATGTT 5400 

GCCCAATAGC TCGCAAAACA GACAGATTAA CCATCTGACC ATTTACACGG CTGATACTAC 5460 

GACCATTTTG CAAGATTTCC CGACGGATGA TAATTTCATC ACCTAATTCT AAACCTTGCT 5520 

CATCAAAAAT TTCCTGTAAA AGACGACTAT TCTCAACTGA GAAAAGCCCC TCAATCTCTG 5580 

CCTTTGGTGC ACCATGACGA ATAACATCTG TCGTCGCACG AGCTCCCAAC ATCATATTCA 5640 

TGGCATCAAT GATAATCGAC TTCCCTGCAC CCGTTTGACC AGTCAGGACA GTCATCCCCT 5700 

TTTCAAAATT GAGGGAAATA GCCTCAATAA TGGCAAAGTT TTTTATCGAA ATTTCAAGTA 5760 
ACATATAGAC CTACCAATTT TTTACTTGTT CAAAGATTTC CTCTGCTAGA CTTCCACTTC 5820

TGGCAATGAC TAAAATCGAG CTATCATCAG TCAAACAGCT AAAAATCTTG TCTGCAAAAG 5880

TCTCGATTAA CTGAGCTTTT ACAAAAGCCG TATTTCCTGG AATAACTTGG AGATTGATCA 5940

TCTTATCCAT CAATTCAGCC GATTCGATAT TGTCTTCAGC CAGTTGCAGA CTTTTTACGA 6000

TTGATTTTGG CAATTCGTAG ACATAGGTGT TGTCTCTCAA AGGAATTTTG ACAATACCTA 6060

ACTCTTTGAT ATCTCGGGAT ACCGTCGCCT GAGTGGCAGT GATACCTGCT TCTTTCAAAT 6120

GTTCTACAAT TTCTTCTTGC GTGCCGATTT GATAATCTGT CACCAATCTT CTAATTTTTT 6180

CAAGTCTCTC TTTTTTATTC ATTTTTAAAT TGACTATGCG CCCTCTCTAC TGCTTCTTTA 6240

ATCTCAGCAA GAATCTGATT GCTTGCTGAC TTTTCTTTTT TCAAATACGC TAAAAATTCA 6300

ATATTTCCAT GTCCACCTTG GATGGGAGAA AAGTCCAAGC CAAGGACTGA AAAAGCTACG 6360

TCTACTGCCA TAGCTGTTAC AGATTCAAGG ACATTCTGAT GAACCTTAGC ATCTCGAATA 6420

ATTCCATTTT TCCCAATCTG CTCACGTCCT GCCTCAAACT GAGGTTTGAC AAGTGCTACC 6480

ACCTGACCTT GATCAGCCAA GACACGGTGC AAGGCTGGCA AAATCAGACT AAGGGAAATG 6540

AAACTCACAT CAATACTGGC AAAGCTCGGC TCCTGCTCGA AATCAGTCTT TTCAGCATAG 6600

CGGAAATTGA ACTGCTCCAT GCTGACAACT CGTGGGTCTT GGCGTAATTT CCAAGCCAAC 6660

TGATTGGTAC CAACATCGAC TGCAAAGACC AACTTGGCAC TATTCTGTAG CATGACATCG 6720

GTAAAACCTC CAGTAGAGGC CCCGATATCA ATCGTAGTCG CGCCATCCAC CGACAAATCA 6780

AAGACCTGCA AGGCCTTTTC CAGTTTCAAA CCACCACGGC TGACATACTT GAGTTTCTCC 6840

CCCTTGAGTT TTAATTCGGT GTCATCTGGA ATTTTCTCTC CTGGCTTGTC AAACCGTTCT 6900

CCATTAAGGA CTGCTACGAC TAGGCCAGCC ATCACACCTC GCTTGGCCTG CTCTCTCGTT 6960

TCAAACAACC CCTGTTTATA AGCTAGTACA TCCACTCTTT CCTTAGCCAT TGATTCTCAA 7020

ACTTTCTACT ACACTTACAA TCGATTCTGT TTCAAAGGGA AGCTGCTGGG CAATTTGTTC 7080

TAATTTTTCA TTAGCTTGAT CCAGGGTTTG GTTACAAAAG GCAATGGACT CTTCCAAGCC 7140

CAACAGGGCA GGATAGGTTG ATTTTTCTGC CTGCAGATCC TTTTGAGGTG TCTTGCCGAT 7200

TTCCTCAAAA CTAGCTGTCA CATCCAGTAC ATCATCTCTG ACTTGAAAAG CAAGTCCAAT 7260

CAATTCACCC ACAGTTTTCA GCTTCACCTG CATTTCAGGT GACAATTCAG CTATAATAGC 7320

TGCCGCTTGG AAGGGATAGG CTAGTAACTT CCCAGTCTTA TTGGCATGAA TAGTCTGAAG 7380

TTCTTCCAAA GACAAGTGCT GGTGTTCGCC CTCCATATCC AAAACTTGCC CTGCTACCAT 7440

ACCCAGACTA CCTGAAGCAA GGGATAAGTT GGCAATCAAG TCCACCTTAA TCTGACTTGG 7500

CAAATCTGCC TGCGCAATCA AGGCATATGA GTCTAAGAAT AAGGCATCTC CAGCCAAAAT 7560

GGCCATAGCT TCACCGAATT TCTTGTGATT GGTTAACCGC CCTCTTCGAT AATCGTCATC 7620

ATCCATAGCA GGAAGGTCAT CGTGAATCAA GCTCCCTGTA TGAATCATCT CTAAGGCAGT 7680

AGCTACCTGG GCGTGAGCAG GTTTGATGGT AACCTGCAAG GCTTCCAGAA CTTCTAACAA 7740

GAGAAAAGGC CGAATACGCT TGCCACCAGC ATGAATAGAA TAGAGAACAG ACTCCCGTAA 7800

ACTAGAGGCA AACTGCTGGT CTCCATAAAA ATCTTCCAAA GCCGACTCGA CAAGAGCTAA 7860

TTTTTCTTGC TTTTTCATTC AAAATCACTT TCTGTTCCGT CTTCTTGCAT GACCTTGACC 7920

AAGGTCTTTT CAGCCTTGTC CAGCGTAGCT TGGAGCTCTT TTGACAAGAC CATGCCCTTT 7980

TGAAAGGCAG TAATCGCATC TTCCAGAGCA ATTTCACCAT TTTCCAAACT TTGGACAATG 8040

GTTTCCAGTT CTGCTAGATT TTCCTCAAAT TTCTTTTGTT TTGACATCTT TAACCTCTAA 8100

TTCTACTTGA CCATCTCGCA TCAAAAGCGT TACTTGGTCT TTTTTCTTCA AACTCTCAAC 8160

CGAATCTACA ACGGACTCTT CTTTTTTGAC AATAGCATAA CCACGCGCCA CGATTCGGCT 8220

AGTATCCAAC ATGAGCAAAG CTTCCGAAAG TCGCTTGGCC TCAGCAACCT TGGCGTCATA 8280

AACTAACGCC ATTTGGCTAC CTAAGAGCTT GTCCAACTGT CCTAAACGGT CTTGATAGCG 8340

TTGGATTTTG GTAACAGGTG ATAATTGTAC TAATTGATGA GTTCTTGCTT GAACTAATTG 8400

TTTGTTATCA GAAATCCGAG TTCGCAAACT TTGTTTCAAA CGCAGTTGCA GTTGGTCCAA 8460

GCGTTGCAAA TAACCGTCAT ACAAGCGCTC AGGTTGTCTA AAGATAACAG ACTGACTGCA 8520

TTTTTTCAAA GCCTCTTGTT TCTTAGATAG AACATTTCGG ACTGCCGTTA CCATCCGTTT 8580

TTCCTGATTT TGCAAATGAG CTAATACATC CAACTTGGTC ACAGGTGTTG CCAGTTCAGC 8640

CGCCGCTGTT GGCGTTGCAG CGCGTCGATC TGCCACAAAA TCTGCCAAGG TCACATCCGT 8700

CTCATGCCCC ACACTAGAGA TAACTGGCAA ACGAGATTCA AAAATAGCTC GTACCACAAT 8760

TTCTTCGTTA AAGGCCCAGA GATCCTCAAT AGAACCACCT CCACGACCAA TAATGAGCAA 8820

ATCCAAATCG TCCCGTTGAT TAGCACGCGC AATATTTCTA GCAATTTCCT CCGCAGCCCC 8880

TTCACCTTGA ACCTTGGTCG GATAAAGAAG GATGTCAACA CCTGGGAATC GCCTGCTGAC 8940

GGTCGTGATA ATATCTCGAA TAACGGCTCC ACTACGGCTG GTTACTACAC CAATTCTCTT 9000

AGAAAATTGG GGCAGAGCTT GCTTGAAGCG TTCTTGAAAC AGGCCTTCTT CTGTCAATTT 9060

TTTCTTAAGT TGTTCAAACT GAATCGCAAG CGCCCCAACC CCATCAGGCT CAGCTTTTTC 9120

AATGATGATG GAGTAGCTAC CACTTGGTTC ATAGACCTGT ACACGCCCAA TCACATTGAT 9180

CTTCATTCCT TCTTCCAGGT CAAACCCTAA TTTCTGATAA ATCCCAGACC AGATGGTCGC 9240

TTGAATAACT GCATGGTCAT CCTTTAGGGA GAAATATTGG TGAGTAGGTG GTTTACGAAA 9300

GTTGGAAACT TGACCAGTTA AATAGACCCG TTCCAAGTAT GGGTCTTTAT CGAATTTCAT 9360

TTTCAGATAC TTGGTCAAAG TTGTTACCGA TAAATACTTT TCCATCTCCA CCTACTATTC 9420

ATTTACTTGC TCTTTCATGG GTATTATTAT ACCAAAAATA TGCCTAAAAA TCTCCATTTA 9480

TGTACCATTA TGAGGGAAAA ATAGAAAAAG GAGGCAAGGC CTCCACATGT GATTATTTGC 9540

TGTTTCGAGC TTCTTCCAAA ATCTTTGCAA TCTTGGTCGT CAACAGGTCG ATAGCCACGG 9600

TATTGCTAAC CCCTTCAGGA ATGACGATAT CAGCATAACG CTTAGTTGAC TCGATAAACT 9660

GGTGGTACAT TGGTTTGACC ACACCTAAGT ACTGGTTAAT AACGCTATCA AGGCTACGGC 9720

CACGCTCCTC CATATCACGC TTGATACGAC GAATAATGCG CACATCGTCA TCCGTATCCA 9780

CAAAAATCTT GATATCCATC AAATCGCGCA GACGCTTGTC CTCCAAGACC AAAATACCCT 9840

CAACGATAAA GACATCTTGA GGTTCCTGAC GATAGGTCTT GCTACTCCGT GTATGCTCTG 9900

TATAGTCGTA GGTCGGGATG TCCACCGGAC GCCCTGCCAA CAATTCCTTA ATCTGCTCGA 9960

TCATCAAGTC TGTATCAAAG GCAAAAGGAT GGTCATAGTT GGTTTTGACG G 10011
(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5365 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

CGTGTGGTCT TAAAAATAGA AGACAAAGAA CAAACTGTTG GAGGCTTTGT CCTTGCAGGC 60 

TCAGCCCAAG AAAAAACCAA AACAGCTCAA GTTGTGGCTA CTGGACAAGG TGTTCGTACC 120 

TTGAACGGTG ACTTGGTTGC TCCAAGTGTT AAAACTGGAG ATCGTGTCTT AGTTGAAGCC 180 

CACGCAGGTC TTGATGTCAA AGATGGCGAT GAAAAGTACA TCATCGTAGG CGAcTAACAT 240 

TTTGGCAATC ATTGAGGAAT AGAAGGAGAA AGTAAGTATG TCAAAAGAAA TTAAATTTTC 300 

ATCAGATGCC CGTTCAGCCA TGGTTCGTGG TGTCGATATC CTTGCAGACA CTGTTAAAGT 360 

AACCTTGGGA CCAAAAGGTC GCAATGTCGT TCTTGAAAAG TCATTCGGTT CACCCTTGAT 420 

TACCAATGAC GGTGTGACCA TTGCCAAAGA AATCGAATTG GAAGACCATT TTGAAAATAT 480 

GGGTGCTAAG TTAGTATCAG AAGTAGCTTC TAAAACCAAT GATATCGCAG GTGACGGAAC 540 

TACGACTGCA ACAGTCTTGA CCCAAGCTAT CGTCCGTGAA GGAATCAAAA ACGTCACAGC 600 

AGGTGCAAAT CCAATCGGTA TTCGTCGTGG GATTGAAACA GCAGTTGCCG CAGCAGTTGA 660 

AGCTTTGAAA AACAACGCCA TCCCTGTTGC CAATAAAGAA GCTATCGCTC AAGTTGCAGC 720 
CGTATCTTCT CGTTCTGAAA AAGTTGGTGA GTACATCTCT GAAGCAATGG AAAAAGTTGG 780

CAAAGACGGT GTCATCACCA TCGAAGAGTC ACGTGGTATG GAAACAGAGC TTGAAGTCGT 840

AGAAGGAATG CAGTTTGACC GTGGTTACCT TTCACAGTAC ATGGTGACAG ATAGCGAAAA 900

AATGGTGGCT GACCTTGAAA ATCCGTACAT TTTGATTACA GACAAGAAAA TTTCCAATAT 960

CCAAGAAATC TTGCCACTTT TGGAAAGCAT TCTCCAAAGC AATCGTCCAC TCTTGATTAT 1020

TGCGGATGAT GTGGATGGCG AGGCTCTTCC AACTCTTGTT TTGAACAAGA TTCGTGGAAC 1080

CTTCAACGTA GTAGCAGTCA AGGCACCTGG TTTTGGTGAC CGTCGCAAAG CCATGCTTGA 1140

AGATATCGCC ATCTTAACAG GCGGAACAGT TATCACAGAA GACCTTGGTC TTGAGTTGAA 1200

AGATGCGACA ATTGAAGCTC TTGGTCAAGC AGCGAGAGTG ACCGTGGACA AAGATAGCAC 1260

GGTTATTGTA GAAGGTGCAG GAAATCCTGA AGCGATTTCT CACCGTGTTG CGGTTATCAA 1320

GTCTCAAATC GAAACTACAA CTTCTGAATT TGACCGTGAA AAATTGCAAG AACGCTTGGC 1380

CAAATTGTCA GGTGGTGTAG CGGTTATTAA GGTTGGAGCC GCAACTGAAA CTGAGTTGAA 1440

AGAAATGAAA CTCCGCATTG AAGATGCCCT CAACGCTACT CGTGCAGCTG TTGAAGAAGG 1500

TATTGTTGCA GGTGGTGGAA CAGCTCTTGC CAATGTGATT CCAGCTGTTG CTACCTTGGA 1560

ATTGACAGGA GATGAAGCAA CAGGACGTAA TATTGTTCTC CGTGCTTTGG AAGAACCCGT 1620

TCGTCAAATT GCTCACAATG CAGGATTTGA AGGATCTATC GTTATCGATC GTTTGAAAAA 1680

TGCTGAGCTT GGTATAGGAT TTAACGCAGC AACTGGCGAG TGGGTTAACA TGATTGATCA 1740

AGGTATCATT GATCCAGTTA AAGTGAGTCG TTCAGCCCTA CAAAATGCAG CATCTGTAGC 1800

CAGCTTGATT TTGACAACAG AAGCAGTCGT AGCCAATAAA CCAGAACCAG TAGCCCCAGC 1860

TCCAGCAATG GATCCAAGCA TGATGGGCGG GATGATGTAA GCTTTCTATA GAAAACAACT 1920

TATAAAAAAC ACAAAAGGAG GGAATGACTA ACCCTTCTTT TTATAGGCTC TTTGTCAACT 1980

GTAGTGGGTT GAAGTCAGCT AAGCTCGAGA AAGGACAAAT TTCGTCCTTT CTTTTTTGAT 2040

GTTCAAAGCG ATAAAAATCC GTTTTTTGAA GTTTTCAAAG TTTCGAAAAC CAAAGGCATT 2100

GCGCTTGATA AGTTTGATGA GATTATTGGT CGCTTCCGGT TTGGCGTTAG AATAGTGTAG 2160

TTGAAGGGCG TTGATAATCT TTTCTTTATC TTTGAGGAAG GTTTTAAAGA CAGTCTGAAA 2220

AATAGGATGA ACTTGCTTAA GATTGTCCTC AATAAGTCCG AAAAATTTCT CCGGTTCCTT 2280

ATTCTGAAAG TGAAACAGCA AGAGTTGATA GAGCTGATAG TGATGTTTCA AGTCTTGTGA 2340

ATAGCTCAAA AGCTTGTCTA AAATCTCTTT ATTGGTTAAA TGCATACGAA AAGTAGGACG 2400

ATAAAATCGC TTATCACTCA GTTTACGGCT ATCCTGTTGT ATGAGCTTCC AGTAGCGCTT 2460




GATAGCCTTG TATTCATGGG ATTTTCGATC CAATTGGTTC ATAATTTGAA CACGCACACG 2520 

ACTCATAGCA CGGCTAAGAT GTTGTACAAT GTGAAAGCGA TCCAACACGA TTTTAGCATT 2580 

CGGGAGTGAA ACAGTCTGGG AGACTGTTTC AGCCTGAGCC TAGAAATTTG AAAGCGAAGC 2640 

TGTTTAGCCA AGTCATAGTA AGGACTAAAC ATATCCATCG TAATGATTTT CACTTGACAA 2700 

CGAACGGCTC TATCGTAGCG AAGAAAGTGA TTTCGGATGA CAGCTTGTGT TCTGCCTTCA 2760 

AGAACAGTGA TAATATTAAG ATTATCAAAA TCTTGCGCAA TGAAACTCAT CTTTCCCTTA 2820 

GTGAAGGCAT ACTCATCCCA AGACATAATC TTTGGAAGCC GAGAAAAATC ATGCTCAAAG 2880 

TGAAAGTCAT TGAGCTTGCG AATGACAGTT GAAGTTGAAA TGGCCAGCTG ATGGGCAATA 2940 

TCAGTCATAG AAATTTTTTC AATTAACTTT TGAGCAATCT TTTGGTTGAT GATACGAGGG 3000 

ATTTGGTGAT TTTTCTTTAC CAGGGGAGTC TCAGCAACCA TCATTTTTGA ACAGTGATAG 3060 

CACTTGAAAC GACGCTTTCT AAGGAGAATT CTAGAAGGCA TACCAGTCGT TTCAAGATAA 3120 

GGAATTTTAG AAGGTTTTTG AAAGTCATAT TTCTTCAATT GGTTTCCGCA CTCAGGGCAA 3180 

GATGGGGCGT CGTAGTCCAG TTTGGCGATG ATTTCCTTGT GTGTATCCTT ATTGATGATG 3240 

TCTAAAATCT GGATATTAGG GTCTTTAATA TCGAGCAGTT TTGTGATAAA ATGTAATTGT 3300 

TCCATATGAA TCTTTCTAAT GAGTTGTTTT GTCGCTTTTC ATTATAGGTC ATATGGGACT 3360 

TTTTTTCTAC AACAAAATAG GCTCCATAAT ATCTATAAGG GATTTACCCA CTACAAATAT 3420 

TATAGAGCCG AAAATTCACA TCTAATATAT GCAGACTACT TTGAAATGAA ATTAAAAAAA 3480 

TTATTAAAGG ATGACACAAA AGTTTTTGAA AAATCTACAT TCAAATTTGT AGAAGGATAT 3540 

AAAATATACC TGACAGAATC TAAAGAATCT GGAATTAAAC AAATGGACAA TGTCATAAAA 3600 

TATTTTGAGT TTATTGAATC TAAAAGTATT GCTTTATATT TTCAAAAACG ATTAAATGAG 3660 

CTGATAGATT AAATAGCATT TTCTCTGTTG AGATATTGTT TTTAAAATAT TGTACTAAAT 3720 

GATTGATGCT ATGTGGAAAT ACAAAAAAAT GTTTTTGATA CGAAGTTGAC CTGTATTTTT 3780 

TATACTAATC ATTTTCGTAT TTTTTGTATT AAACGATATA AGTTTGTTGT AAACTTACAA 3840 

GGAATAAAGA CATTAAAAAA TAACAGTATA TCTATTTGTT TTATATATTT TACGAATTCT 3900 

GCATAAATCT CTTTCTAGTA ATGTGTTGTA ACTCTGCTAT AATAGATTTA TTCCTTTTTG 3960 

TGTTTACACA ATTTATTTTA TAGTACCAAA AAAGGTCAGG ATTTTGTTCC TGACCTTTGA 4020 

CAACTTTACC GATTCTTTAG TTCTACATAG CGCTTGTACC AAATGTTTAC ATAGGCTTCT 4080 

GAGAAAGGAC CACGTCCATT GTTAATCCAA TCAACAAGAA TTTTGACATG TTCTTTTAAA 4140 

ATATAGTCCA AGTCATCAGA ATAATTCATT TTGCGTTTGT GACGCTCGTA CTCTTCAACG 4200 

TCCAAGAGAC GTTTTTCCCC ATCTGTAAAA ATTTTAACAT CCAAATCGTA ATCAATATAC 4260 
TTCAGTGCTT CTTCATCCAG ATAGTAGGGG CTAGCCATAT TGCAATAGTA AGAAGTTCCA 4320

TTATCACGAA TCATGGCAAT GATATTAAAC CAATATTTCT TGTGAAAGTA AACAATAGCC 4380 

GGTTCTCGAG TGACCCAACG ACGACCATCA CTTTCGGTAA CAAGTGTATG ATCGTTGACA 4440 

CCAATAATGG CGTTTTCTGT TGTTTTTAGT ACCATGGTGT CCCGCCAAGT TCGGTGGAGA 4500 

CTCCCATCAT GCTTATAACT TTGAATTGTA ATAAAGTCGC CTTCTTTTGG AAGCTTCATA 4560 

ACTAACCAAC TTTCTACAAT TTATAAGTTT ATCATTTACT ATTGTACCAT AAAATTACCC 4620 

AAAATCTGTG AATTTCACTT GGAAATATTA AAGATATTCT CTAAGAGCGC TTGCTATATC 4680 

CGAAAAATCG TAGCCCTTTC GTGCTAAAAC TTGAGTTAAA CGCTGCTTCA GTTCGTATCC 4740 

TTCATACTTT CGGGCATACT TAGTATATTG CTTATCAAGT TCCTTGAAGA TGAGTTCCTG 4800 

AGTCGTTTCT TCATCAACTT GACTATCCAA TTCGTCAAAG GCAATTTTAG CATCAAAATA 4860 

AGAGAAGCCC TTGTTAGTCA AGTTCTGGAT AATCTTATCT TGCAGGGCAC GAGCTGGAAG 4920 

TTTTCCCTCA TATTTTTTCA ATAGTTTATT GGCTACACGT TGAGCAACTT CCGAAAAATC 4980 

AAAATCATTC AAGATTTCTT CTATAGTAGA TTTTGAAATT CCTTTTTGTG CTAATTTCTG 5040 

AGTCAGTACA TAAGGTCCCT TGTCTCCTGA AAGTTGATTG GCATTGATGA TAGCATAAGC 5100 

GTACTGGCTA TCATTAATCC ACTTCTCTTC TTTAAGATTA GCAATGACTT GAGAAACGAT 5160 

GTTTTCATTA ATATCATATT TTTTCAGATA TTCTCTGACC TCTTTTTCAG TACGTGCTTT 5220 

AAAGGATAAG TGGTAGAGGG CCAGATTCTT ACCATAAGAA AATTGAGCAA AGTCTTGAAT 5280 

CTCTTTCAAT TCCTCTTCGC TTATCACCTT ATCTCTCGAT AACATAAAAC GAACAATTGT 5340 

GTCTTCGGTG ATATAGCATT TGTCG 5365 
(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3636 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 

TTTCCAGAAA GAAGTTGAGT AAAGTCTTTA TCAAAGAGAA TGACTTCCGT ATTGGAACTG 60 

ACATTAGGTT TTATTTCTAC TTTACTAGCG TCCGCCCTAG CATTTTCTAA ATCTTTAATC 120 

TCTTCTGTTG CCCTATTTAT AGCCAGCTGA ATAACTGCTT GAGGATTTTC ACTCAGTCCA 180 

TGAAGCTTAT CGTCCACCGA AGTATAAAGA CTCGAATGCA TGACTTGTAA AATAATCAGA 240 
GTCATTGTAG AAAAAATCAG GGTVlAAfiAPft PPCA ArT'TV^r 1 rr , amaaaan»a


ACTAAAGTCA 


300 


TCCGCATACC ATGTTTTTTT AAryp m n»i\p'ivi a a f & mnnvnuvn »R»»/^jim7ir>r> 
a i i i i a nftVil l iALlu AALA 1 Li 1 11 AAAAGATACC 


CAACACTACG 


360 


CAAAGTTTGC AAATTPTPTfl raiAifiTfir*p nw i r < <pnwnA an» Tinr<nmi»pp^* 


CTTTTGAAAC 


420 


ATAGACTTCG ACAAPPRA.AH m PP r P m P m J\ m P &r v P&Tr<Kt »m nr>rr>Mnik^iin 


GGTCAAAAAT 


480 


vtw.ui^,i in uviuuutni\.A V.A1 1 1 ItjAl 1 1 luAAGGAAA TAAACTAGTA 


AATCGAACTC 


540 


v»\-v-v_n\3^. nn 1 1 loalavi vj/Uj j, a.1 L 1 1 L AA(_ 1 1 1 AACG GTATTGGTTG 


ATAAATTAAC 


600 


v*ni>un iftHL lla l Aftb I tA AGGTGTTTTC ATTAAACtTC cctgaacgtt 


TGAGAAGGGC 


660 


CTGAATCCGC ATTTTAAGTT CTTCTAGGTA GAAAGGTTTG GTCAGATAAT CATCCGCTCC 720


laiji i-uuiAi LLA i'lj lVUvJ T TGTCATCCAA ACTTTCCTTG GCAGT CATAA 


TCAGAACTGG 


780 


TGTCGTAATT CCCTTTTCAC GCAATTCTTT TAAGACTTGG AAACCATTTT TTTCTGGCAA 840

CATCAAATCC AGCAAAATCA AGTCATAGAC ACCACTCTCA GCTTCGTAGA GACCTTCTTC 900

TCCATCAAAT ACCTGCATAA CATCCGCAAA ATCGTCTAAA AAGTCAAATA CTGAATTTGA 960

CAGACCTAGG TCATCCTCAA CCAATAAGAT TTTTATCATG AGAAACTCCT CCTTATTAAA 1020

ACTATTATAC CAAATTTGCC TTAAAAAAAA CTCAACTCTC TGCATTTTAC ATGAGATAGC 1080

TGAGTTTTCT TTTTATTTTA GGCTTATTTA TGCATTTCCG TATTGAAGAA CAACTGCTTC 1140

GACTGCAGCT TTTTCACGGC TAATCAAGTC AACACGCGCT GCAATTTCCT TGATTCCGAT 1200


AL-LuAitiiiA LbLiLlAAuAG CAAGGTCAGA AAGTTGCGGT TCAAAGAACT 


CCTTGTATTC 


1260 


CGCCAAGCGT TGCTGAGTCT TAAATACATG AGCAGGAAGG ATAACAAAGC TATCAAAGCT 1320

CATATCTCCT CCAAGGGCTG CCTTAATCCA AGCCCAGTTT TCACGCGCCC AAGACCAAGC 1380

TGTTTTCTGA GTTGCTTGAT GAGCTAGGAA TTGGTAATAC CAAGCAGACA AGTCCTGTGG 1440

TTTGACCACA AATTTGTCCT TCCAAGAAGT AATCAGGTTT TGGATATTAT CCGCATCTGT 1500

ACTGTATGCA AGAGCTGCTG CCAACTGGCG TTTAAAGACA GCATCTGTTG CGTGAGTATA 1560

AGTATCAAGA TAAAGTGCTA ACAAGTCTTT AGTCTCATGA TGTTTCATCT CATTAATCAG 1620

AACTTGTGAG CGAATAGCTG CTGGGAGTCC TGCAAGATTC TCCTTGTGTG TTGCGAAGAT 1680

TTGGCTAGCG ACTTGACTAG CTTCTGCATC ATTTGAGCGA ATCATCATCG AAACAGCCAG 1740

CTGACGAACC AATTCATCCT CATCTGATTC TCCGTCTTTA GCTTCAAAAC CAAGACGGTC 1800

ATAGTTATGA CGAGCCAATT TAGCAACCAG TCCTTTGAAG GCTGTTTCAG CATCCGTTCC 1860

TTCATCAATA AAGCGCTCAA GGGCTGAAAT CACTTGAGAA ACAGCTGAAA CCACCAGATA 1920

AGACTCTTCC TTAGCAAGTT TATCAAGAAC TGGAAGCAAG TCTGCATAAG AAATGTGCCC 1980

TGCCTCAGCC AACAAACGAC GTTCTTGAAC AATTTGCAGT TTGCTTGTGT TATCAAGTGT 2040

CTCTAGCTCA GCAAGAACAG CTGCTAACAA GTCTCCTTGA TAGTCGGTAA TATAGTGGGC 2100

AGTATTTTCA GTGTTGAGAC GAAGAGCTCC TTCATTTTCA GCAAGAAGAG CTGCGTAGCC 2160

AGGGATTTCG ATACTTTCAG TTTCGAGTGT ATCAGGCAAG CCTTTCCAGT TGCTATTGAG 2220

GGGCACCACC CAGAGACGGT TCTTGTCTTC GTTCTCACCG ATGAAGAATT GTTTTTGTGA 2280

AATCTTCAAG ACATCATTTT CAACTTTAAC AGTAAGAACT GGGTAACCAG GCTGTTCCAA 2340

CCAAGAATCC ATGAAGGCTG CGACATCACG TCCTGACGCT TGACCAAGGG CATCCCAAAG 2400

GTCACTACCA ATGGTGTTGC TGTATTGGTG TTTTTCAAAG TAGGCGTGCA AACCTTTAGC 2460

AAAATCAGCA TCTCCTAGCC AACGGCGAAG CATGTGCATG AGACGGCTTC CTTTGGCATA 2520

GACGATAGCG CCGTCAAAGA GTGTATTGAT TTCATCTGGA TGTTTAACTT CGACGTGGAC 2580

AGACTGAACG CCATCAGTAG CGTCACGTTC AAGAGCAAGA GGTACTCCAC CTGTTTGGAA 2640

ATCTTCAAAG ATATTCCAGC TTGGTTCGAT GGTATCCACA CAGACGTATT CCATCATATT 2700

AGCGAAACTT TCATTGAGCC AAAGGTCATC CCACCATTTC ATAGTCACGA GGTTCCCAAA 2760

CCATTGGTGA GCCAATTCAT GGGCCACAAC AAGGGCAACT TGTTGACGGC TAGCAAATGT 2820

AGAGTTCTCA TCGACAACCA AGTAAACTTC ACGGTAGGTC ACAAGACCCC AGTTTTCCAT 2880

AGCACCAGCT GAGAAGTCAG GAAGGGCGAT GTGGAGAGAT TGAGGAATTG GGTACTTAAC 2940

TCCATAGTAA TCTTCGTAAA ACTCGATAGA GCGAACAGCG ATATCCAGTG AGAAATCAAG 3000

ATTTGAAAGT GGATGTGCTT TGGTTGAGTA GACACCTACC AGGGTACCAT TTTTAGTTTT 3060

AGCGGTCACC CCTTGCAAAT CACCAGCAAC AAAGGCCAAC AAGTAAGAAG ACATGCGAGG 3120

TGTTGTCTCA AACTTCCAGA TACCTGTTTC CTTACGGTTT TCAACATCGA TTTCTGGCAT 3180

GTTTGACAAG GCCAATTCAC CTTCTGCTTG GTCAAAGCGA AGAGAGAGGT CAAAAGTTGC 3240

TTTGGCTTCA GGCTCATCCA CACATGGGAA AGcTTCGCGC GCAAAATGGC TCTCGAACTG 3300

AGTAGACAAG ACCTCCTTCT TGACTCCATC AACTGTATAA TAAGAAGGGT AAATCCCTGT 3360

CATGTTGTCT GTAATTTTAC CAGAAAAGGC AAGAACCAAT TCAACTTGAC CAGCCTCAGC 3420

CAATTCGATA TGAAGGGCTT CATTGTCATG GTCAACTGTA AATGGACGAG CTTGACCTGC 3480

AACTTCTACA GAGGTGATTT CCAAATCTTT TTGGTGGAGG GAGATGCGGT CACTCTGTGC 3540

TTGACCAGTG ATGGTCACTT TCCCAGAAAA AGTCTTGGTC TCACGACTCA AATCTAAAAA 3600

TAAATCATAA TGTTCAGGAA CAAATTGCTT AATGGG 3636



(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5066 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 



ATAGCGTGTA ATAATCGATT TTAGAGGTAC CATAAGCCAC CTCCTACAAA TAGAAACCGA 60

TATAAATCAA TGCCTTCCAC CCTTAGACTT CCCTAGTTCC TGTCTCAAGC GAAACATTTC 120

TTTGAAACAG GAATAAGTTA ACCAATTCAT ACCAATAGCT AGCAGAATAA AAAGAAACCA 180

AATGCCCCAT AACTTGATAT CTGTCACATT TCTCAAGACG GTATTGAAAA ACAGAACTGA 240

AACAACTGTC CAAGCAAGGC TAAAAAGAGA ATAGAAGGGG ATGTAAAACC AGTAAAAATA 300

ATAAAAAATT GGAAAAAACT TACTATTTCT GTTGGCCTTT TCAATCCAGT TATCAAAATA 360


AAAGTACGGT 


GCTAAAAGTA 


AGAATTTAAA 


CAAATGTTCC 


A1\.ALLL«ALA 


TCCCCCCTTC 


420 


TTTTGATAGC 




ATTTTATTAT 


ATCAAAAAAA 


TCCGGAACTG 


TCATTCCAGA 


480 


TTCTACTTTT 


TTATTTGCGT 


TTTCTTGCGA 


TGAGATGAAT 




TCAAAAACAA 


540 


AGGCCTTGCG GATTTGATTT TCCAAGAAAC GCAGGTAAGA AAAGTGCATG AGTTCTTCTT 600

CATTGACAAA GATGACAAAG GTTGGTGGTT TGGTTGCCAC TTGGGTCGCA TAGAAAATCT 660

TGAGACGTTT TCCTTTGTCT GTCGGTGTTG GGTTGATGGC AATGGCATCC ATGATGACAT 720




AGCTGATGGA 


ATACGTGTAT 


TTTGACTTTC 


GCTGATTTGC 


TTAATCATCT 


780 


CAGGAAGTTT GTGGAGACGT TGCTTGGTTA AAGCTGATAC AAAGATAATG GGTGCGTAAG 840

GCAGGTATTG GAACTGCTCA CGGATATCTT CTTCCCAGTT TTTCATAGTG TGGTTATCTT 900

TTTCAAGCGT ATCCCACTTG TTGACCACGA TAATCATCCC TTTACCAGCT TCATGGGCAA 960

ATCCTGCGAT ACGCTTGTCG TACTCACGAA TGCCTTCTTC CGCATTGATG ACCATCAAGA 1020

CCACATCTGA ACGGTCAATA GCACGCATGG CACGCATAAC AGAGTATTTC TCAGTATTTT 1080

CATAAACCTT ACCAGACTTA CGCATACCAG CCGTATCAAT CATGGTAAAC TCTTGACCAT 1140

CTGTATCTGT AAAGTGGGTA TCAATGGCAT CACGAGTTGT TCCAGCAACA GGACTAGCAA 1200

TAACACGGTC TTCTCCCAAG ATAGCATTGA TCAAGCTTGA TTTTCCAACG TTAGGACGAC 1260

CAATCAAGCT AAACTTAATG ACATCTGGAT TTTCTTCCTC ATATTCATTT GGAAGATTTT 1320

CTACGATCGC ATCTAGCACA TCCCCTGTAC CGATTCCATG GACAGATGAG ATAGGCAATG 1380

GTTCACCCAA ACCGAGAGCA TAGAAATCAT ATATATCATT TCTCATCTCA GGGTTGTCCA 1440

CCTTGTTGAC TGCGAGGATA ACTGGTTTGT GGGTCTTATA AAGCTTACGA GCTACGTATT 1500

CGTCTGCATC AGTAATTCCT TCCTTACCAG ACACGACAAA AACGATAACA TCTGCTTCTT 1560

CCATGGCAAT TTCTGCCTGG TGCTTGATTT GTTCCATGAA AGGAGCATCG ACATCATCAA 1620

TTCCTCCTGT ATCAATCATG CTAAAAGAAC GATTGAGCCA CTCACCCGTT GCATAAATAC 1680 

GGTCACGTGT CACTCCTTCG ACATCTTCTA CAATGGAGAT TCGCTCACCA GCGATCCGAT 1740 

TAAATAGGGT TGATTTCCCA ACATTGGGAC GTCCTACAAT GGCAATAGTT GGTAGGGCCA 1800 

TAATTTCTCA CTTTCTACAA TAATTTCTTC TGTTCAAGAT TTTTTCTAGT TGAGCTTGGT 1860 

TCAGCTTGAC CAAACTGTTC TGCTAGGCGC TGACTCCAGC TTGTGGTCGC ACGCGCCCCA 1920 

GCATAGTCAG CCTGAACACG GTCATAAGCT TGGATTGCCT CAGTTGACTG TTCTTGGTAT 1980 

TCTTCCTCAA AGACAACATT CTCTAGTGGC AGTCTCGGTT TCATATCATG ATGTTGATTT 2040 

GGCACACCCA GTGCCATCCC AAAGACAGAA TAGGTGTAGT CAGGTAGGTT AAAGAGCTCT 2100 

GCCACTTCTT CAGACTTGTA TCGAACCAAA CCGATAATCA CACCACCATA GCCCAAGCTT 2160 

TCAGCTGCCA ACAAGGCGTT TTGTCCAGCA AGAGCTGCAT CGACCGAACT AATCAAGAGA 2220 

CCTTCCACAC CTTGGGGTTG GAAGGTGTCG GTATGAAGTC GGGCTCCCTT TTCTGCTCGG 2280 

TTCAAATCTC CGACAAAGAG AAGGAAAACA GCAGACTGGC GAATGGCTTC TTGAGGTACC 2340 

AATTCATACA AGGCATCTTT CTTCTCTTGA CTTCGTACCA CAATCACAGA GTAGGATTGG 2400 

AAATTCTTCC AAGATGATGC CATCTGGGCT GCTGTCAAAA TCTCATTTAA GTCTACTTGG 2460 

GGAATTTCTT GCTCTTTAAA CCTGCGCACT GAAGTATGAG CCTTCATCAA TTTAATGGTT 2520 

TCTGTCATCG ACGGTTTACT CCTTCTAAAC GAGTCTCCTC AGCCAAATAA CGGATGCGTT 2580 

CCATGACCCG TCTGGCTTCC CAGGTTTCGT CATTTCCATG TTTCACTTTC GCAAAATGCT 2640 

TCTCCAAATC TTCAAAGTTG AAGTTGGATG TGAAAAAGGT CGGTAAATTT TCCTGCATCC 2700 

GATATTGGAG AATGACCTGC AGGATTTCGT CACGCACCCA AACGGTTGAT TGCTCGGCGC 2760 

CAATATCATC TAAAATCAGG ACCTCAGACA GCTTAATCTC ATCCACCAAG GTCTTAACAT 2820 

TGCCATCACT GATAGCATTT TTGACATCAA TGACAAAGCT AGGATAGTGG AGGAGAGTTG 2880 

ATGAAACACC ACGTTTTTCT GATAAATCAT GAGCTAAGGC CGCCACCATG AAACTTTTAC 2940 

CCACACCAAA GTCTCCATAT AAGTAAAGAC CTTTTCGAAT AGCTGGATAT TGCTCCACGA 3000 

AGGCTAGTAG CTTTTCAAAA ACTGGTAAGC GCCCCAAATC ATCCAAGTCA ACTTGAGCCA 3060 

AACTAGCTTT CTTGAGACTG GCTGGTAGAT TGATTAACTT GAGACGGTTC TTAATAGCCG 3120 

CTTCTTTTTC AGCCGCGATT AGCTCAGGAG TTTCTTCATA TGAAACATCT GCATAACCAT 3180 

GATTGTTAAC CAAAATCGGC TTGTAGCCTT TGGCAATATA ATCCGTATCG GCACGGAGAA 3240 

ACTTGTCACG CTCGGTGATG TACTGATTAA ACTTGGAGAT ACTGCGATTT AATTCCTTTG 3300 
GAGTTAAGGA TTCTTGCTGG ATAAAGGCCG CAACATCAGG GTCCTTCATG ATTTTCTGGA 3360

CCAAATCTTG ATAATAAAAA CGGCTGGGTT GACGTTTGAG TACGTCTCCG ACACTTTCCA 3420 

TCTAATCTCC TCCTTTTTCT AATCGAGCTA ATAGTTCTTG CTTCTTACGT TCTAGTTCCA 3480 

GACGAGTTTC CTCGCTGGTT TCATTCTTAT ATTCAGGATT ACTCCATTTA GGAACATTGG 3540 

TTTTTTCTGG GGCAGTCTGA TTCTGTTTTT GTGTTTTTGC TTTCTGCCCT CGATCACGAA 3600 

TTCGTAAAAC GGCCTCTTCT GCCGAATGAA TCTTTTGATA GGCATAGTCA TTGGCTACCT 3660 

TCATGGCATA TTTCTCATTG ATATTTGCCG AATCCACCTT ATTAAAGGTC AATAAGAGAA 3720 

TAATATTGAT GACTTCGTCC AGTAAGCCCA AGCCAGCCAT CTGTTGCAAG AGTTCTCTTT 3780 

CTGTTTGGGT AATGGTTCCC TTGCGTGTTT GCTTGATTTC TGCTAAGAAC TGCAGGGCAG 3840 

TTTTACTTTT AGCTTCTTTG ATAATGGTCG CTTCCTTAAG ACTAAAGTCA GAGGAAACTG 3900 

GTTTTTGAGC AATTTTTTCA CGCATGCGTT TGGTTGAAAT AACCTGGGAA ACAGCTGTTG 3960 

ACTTGGCCAA TTGATAGGTT TCAAACCAAG TCCATTTCTT CTCCTCGGCA ATAGCAAAGA 4020 

GGTTTAAGAC ATCGGACTGC TCATCCGCAA AACGAAGTCC ATCTCGAGCC ATCAGCTGGC 4080 

GAAAATGTTC CAAGTCAAAA TCATTGGCCA CTTTCTTCTT GAGACCAAGG TCTTCTTGAC 4140 

TGCCTAGTTC TGCCAATTCT GGAAAGACTT GATTGAGTGA GACAGGTATT TCTTCACCAT 4200 

CAGCACTTTC AACTTTCAAA TCCTCCACAG CTACATCGCC AATCTTTTTC TCTAAGAGTC 4260 

TGCGATAAAC AGGATGCCGC AAGAAGTCTT GACTAGATAG AGGAGCATGG AGGGCTAGCT 4320 

GATAAACATC ACCCTTTTGA TAGAGGGTCA AGAGATTAAA AGCAGATAAG ATTTTCAATG 4380 

ATTTTATCAG TCTATCCATC CCAAAGTTGA GATGGTTGAG AATGCTTGAA AAAAGATATT 4440 

CCTTTCTACC ATTATCCCAA AAACTGATTG TATAAAGATA AAGGCTCAGT GCCTCCTGAC 4500 

CGATAATCGG GAGGTAGCAC TGTACCAGAG ATGAGGTATC TTGCGACACC CGATTATTCT 4560 

TTAGATAAGA AAAACGGTCA ATTGGCTTCA TTTATCTTTC CTTTTTCTTT TTAGAGGACT 4620 

GGGTGATTTG TTGGAGCAAG CTCTCTAACT CACTGACATC CTTAAAACTA CGATAGACAC 4680 

TAGCAAAACG TACATAGGTA ATCTCGTCCA ATTCAGCCAA CTCCTCCATG ACGAGTGAAC 4740 

CAATGTCCTC ACTTTGAATT TCATTTTCAT TTCGACCACG GAGTTTCTGT TCGATACGAT 4800 

TGACTACCAT GTTGATTTCA TCACTTGACA CAGGACGTTT CTGGGCTGAG CGGATAATCC 4860 

CATTAAAGAT TTTATCTCTG GAGAATTGTT CCCGTGTGCC ATCTTTTTTA ACAACCACTA 4920 

AGGTTCTTTC TTCTACTCGT TCGTAGGTTG TAAAACGGTG TTGGCATTCG TCGCACTCAC 4980 

GTCTTCTACG AATGGTGTTC CCTTCTTCTG CTTGGCGACT ATCGATAACA CTTGACTTGG 5040 

TAGCCCCACA TTTTGGACAG GGTACC 5066

(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9607 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 



CACTTGAAGT ATTTGAAACA GCTATGGAAA ACATCATGCC TGTACTTGAA GTACGTGCAC 60

GTCGTGTTGG TGGTTCTAAC TACCAAGTCC CAGTTGAAGT TCGTCCAGAA CGTCGTACAA 120

CACTTGGACT TCGTTGGTTG GTAACAATCG CTCGTCTTCG TGGTGAACAC ACAATGCAAG 180

ACCGTCTTGC AAAAGAAATC TTGGATGCTG CTAACAACAC TGGTGCAGCA GTTAAGAAAC 240

GTGAAGATAC TCACCGTATG GCTGAAGCTA ACCCTGCATT CGCACACTTC CGTTGGTAAG 300

ATAGGATGCG AAAGCGTTAA GAAAGTCCCA GAGAAAATAG GGAATCGAAG CAGGTTGCGG 360

TTGCAACCAA TGAGATTCAT CTTTTTCTCC AGACTTTTAG CTTGAGCTCA ACTAAATCAT 420

GATGCTAGGA ACGGTAAGGA TGCAAGGTAA AAATAGGAAA CTGACGCAGT ATTCGACGAA 480

TACAAGGAGT TTTATCTTTT TCACGCAGCA TCCCGTTCCA GCTCACATCG GCTAACTAAC 540

TTTAGCCCGG GTTCAAATTA GCTAAATCGA TTAGTATTAG CTATAACTCA GCTTACCATC 600

TCGTAAGTTG AAACCAACAA TAGCATGAAA ACATTGAGAA CGGGTAGGTC CTGCCTATCC 660

GTTTTTATTA AAATCGTGTT ATAATAGAAT AGAAATCAAA AATAAATAGG AGAAACAAAC 720

CTCATGGCAC GCGAATTTTC ACTTGAAAAA ACTCGTAATA TCGGTATCAT GGCTCACGTC 780

GATGCCGGTA AAACAACAAC TACTGAGCGT ATTCTTTACT ACACTGGTAA AATCCACAAA 840

ATCGGTGAAA CTCACGAAGG TGCGTCACAA ATGGACTGGA TGGAGCAAGA GCAAGAACGT 900

GGTATCACGA TCACATCTGC TGCGACGACA GCTCAATGGA ACAACCACCG CGTAAACATC 960

ATCGACACAC CAGGACACGT GGACTTCACA ATCGAAGTAC AACGTTCTCT TCGTGTATTG 1020

GATGGTGCGG TTACCGTTCT TGACTCACAA TCAGGTGTTG AGCCTCAAAC TGAAACAGTT 1080

TGGCGTCAAG CAACTGAGTA CGGAGTTCCA CGTATCGTAT TTGCCAACAA AATGGACAAA 1140

ATCGGTGCTG ACTTCCTTTA CTCTGTAAGC ACACTTCACG ATCGTCTTCA AGCAAATGCA 1200

CACCCAATCC AATTGCCAAT CGGTTCTGAA GATGACTTCC GTGGTATCAT TGACTTGATC 1260

AAGATGAAAG CTGAAATCTA TACTAACGAC CTTGGTACGG ATATCCTTGA AGAAGACATC 1320

CCAGCTGAAT ACCTTGACCA AGCTCAAGAA TACCGTGAAA AATTGATTGA AGCAGTTGCT 1380

GAAACTGACG AAGAATTGAT GATGAAATAC CTCGAAGGTG AAGAAATCAC TAACGAAGAA 1440

TTGAAAGCTG GTATCCGTAA AGCGACTATC AACGTTGAAT TCTTCCCAGT ATTGTGTGGT 1500 

TCAGCCTTCA AAAACAAAGG TGTTCAATTG ATGCTTGATG CGGTTATCGA CTACCTTCCA 1560 

AGCCCACTTG ACATCCCAGC AATCAAAGGT ATTAACCCAG ATACAGACGC TGAAGAAATT 1620 

CGTCCAGCAT CTGACGAAGA GCCATTTGCA GCTCTTGCCT TCAAGATCAT GACTGACCCA 1680 

TTCGTAGGTC GTTTGACATT CTTCCGTGTT TACTCAGGTG TTCTTCAATC AGGTTCATAC 1740 

GTATTGAATA CTTCTAAAGG TAAACGTGAA CGTATCGGAC GTATCCTTCA AATGCACGCT 1800

AACAGCCGTC AAGAAATCGA CACTGTTTAC TCAGGTGATA TCGCTGCTGC CGTTGGTTTG 1860 

AAAGATACTA CAACTGGTGA CTCATTGACA GATGAAAAAG CTAAAATCAT CCTTGAGTCA 1920 

ATCAACGTTC CAGAACCAGT TATCCAATTG ATGGTTGAGC CAAAATCTAA AGCTGACCAA 1980 

GACAAGATGG GTATCGCCCT TCAAAAATTG GCTGAAGAAG ATGCAACATT CCGCGTTGAA 2040 

ACAAACGTTG AAACTGGTGA AACAGTTATC TCAGGTATGG GTGAACTTCA CCTTGACGTC 2100 

CTTGTTGATC GTATGCGTCG TGAGTTCAAA GTTGAAGCGA ACGTAGGTGC TCCTCAAGTA 2160 

TCTTACCGTG AAACATTCCG CGCTTCTACT CAAGCACGTG GATTCTTCAA ACGTCAGTCT 2220 

GGTGGTAAAG GTCAATTCGG TGATGTATGG ATTGAATTTA CTCCAAACGA AGAAGGTAAA 2280 

GGATTCGAAT TCGAAAACGC AATCGTCGGT GGTGTGGTTC CTCGTGAATT TATCCCAGCG 2340 

GTTGAAAAAG GTTTGGTAGA ATCTATGGCT AACGGTGTTC TTGCAGGTTA CCCAATGGTT 2400 

GACGTTAAAG CTAAGCTTTA TGATGGTTCA TATCACGATG TCGACTCATC TGAAACTGCC 2460 

TTCAAGATTG CGGCTTCACT TTCCCTTAAA GAAGCTGCTA AATCAGCACA ACCAGCTATC 2520 

CTTGAACCAA TGATGCTTGT AACAATCACT GTTCCAGAAG AAAACCTTGG TGATGTTATG 2580 

GGTCACGTAA CTGCTCGTCG TGGACGTGTA GATGGTATGG AAGCACACGG TAACAGCCAA 2640 

ATCGTTCGTG CTTACGTTCC ACTTGCTGAA ATGTTCGGTT ACGCAACAGT TCTTCGTTCT 2700 

GCATCTCAAG GACGTGGTAC ATTCATGATG GTATTTGACC ACTACGAAGA TGTACCTAAG 2760 

TCAGTACAAG AAGAAATTAT TAAGAAAAAT AAAGGTGAAG ACTAATCCGT CCTCACTCTA 2820 

GAAGGAAGTC ACTTAGTGGC TTCCTTTTGT CTTTAGAAAA TACCTCTAAA TATGGTAAAA 2880 

TAGTAGAAGA ATAATGTGAG GAAAATGAAT GTCAAATAGT TTTGAAATTT TGATGAATCA 2940 

ATTGGGGATG CCTGCTGAAA TGAGACAGGC TCCTGCTTTA GCACAGGCCA ATATTGAGCG 3000 

AGTTGTGGTT CATAAAATTA GTAAGGTATG GGAGTTTCAT TTCGTATTTT CTAATATTTT 3060 

ACCGATTGAA ATCTTTTTAG AATTAAAGAA AGGTTTGAGC GAAGAATTTT CTAAGACAGG 3120 

CAATAAAGCT GTTTTTGAAA TTAAGGCTCG GTCTCAAGAA TTTTCAAATC AGCTCTTGCA 3180

GTCCTACTAT AGGGAGGCTT TCTCTGAAGG TCCATGTGCT AGTCAAGGTT TTAAGTCCCT 3240

TTATCAAAAT TTGCAAGTTC GTGCTGAGGG TAATCAGCTA TTTATTGAAG GATCTGAAGC 3300

GATTGATAAG GAACATTTTA AGAAGAATCA TCTTCCTAAT TTAGCCAAAC AACTTGAAAA 3360

GTTTGGTTTT CCAACTTTTA ACTGTCAAGT CGAGAAGAAT GATGTCCTGA CCCAAGAGCA 3420

GGAAGAGGCC TTTCATGCTG AAAATGAGCA GATTGTTCAA GCTGCCAATG AGGAAGCGCT 3480

CCGTGCTATG GAACAACTGG AGCAGATGGC ACCTCCTCCA GCGGAAGAGA AACCAGCCTT 3540

TGATTTTCAA GCGAAAAAAG CTGCAGCTAA ACCCAAGCTG GATAAGGCGG AGATTACTCC 3600

TATGATCGAA GTGACGACAG AGGAAAATCG TCTGGTATTT GAAGGGGTTG TTTTTGATGT 3660

GGAGCAAAAA GTGACTAGAA CAGGTCGTGT TTTAATCAAC TTTAAAATGA CGGACTATAC 3720

TTCAAGTTTT TCTATGCAAA AGTGGGTTAA AAACGAGGAA GAGGCCCAGA AGTTTGACCT 3780

CATCAAGAAG AATTCTTGGC TCCGAGTTCG AGGGAATGTG GAGATGAATA ACTTCACACG 3840

CGATTTGACT ATGAACGTAC AGGATCTGCA GGAAGTTGTT CACTATGAGC GGAAGGATTT 3900

GATGCCAGAA GGTGAGCGTC GGGTTGAGTT TCATGCTCAT ACTAACATGT CGACTATGGA 3960

TGCTTTGCCA GAGGTCGAAG AGATTGTTGC AACAGCTGCT AAGTGGGGAC ACAAGGCGGT 4020

TGCTATCACG GACCATGGGA ATGTCCAGTC CTTTCCACAT GGCTATAAGG CGGCTAAGAA 4080

AGCGGGAATC CAGCTGATCT ATGGGATGGA AGCGAATATC GTGGAGGACC GTGTCCCTAT 4140

CGTCTATAAC GAAGTGGAGA TGGACTTGTC AGAAGCAACC TACGTGGTCT TTGACGTGGA 4200

AACGACGGGA CTTTCAGCTA TCTATAATGA CTTGATTCAG GTTGCGGCTT CTAAGATGTA 4260

CAAGGGGAAT GTTATTGCTG AATTTGATGA ATTTATCAAT CCTGGGCATC CCTTGTCAGC 4320

CTTTACTACA GAGTTAACTG GAATTACAGA TGATCATGTC AAAAATGCCA AACCACTAGA 4380

ACAAGTTTTG CAAGAATTCC AAGAATTTTG CAAGGATACG GTCCTAGTTG CCCACAATGC 4440

TACCTTTGAC GTTGGCTTTA TGAATGCTAA TTATGAGCGG CATGATCTTC CAAAGATTAG 4500

TCAGCCAGTT ATTGATACGC TGGAGTTTGC TAGAAACCTC TATCCTGAGT ATAAACGCCA 4560

TGGTTTGGGG CCTTTGACCA AGCGTTTTGG TGTGGCCTTG GAACATCACC ACATGGCCAA 4620

CTACGATGCG GAAGCGACTG GTCGTCTGCT TTTCATCTTT ATCAAAGAGG TAGCAGAAAA 4680

ACATGGTGTG ACCGATTTAG CTAGACTCAA CATTGATCTA ATCAGTCCAG ATTCTTACAA 4740

AAAAGCTCGG ATCAAGCATG CGACCATCTA TGTCAAGAAT CAGGTAGGTC TAAAAAATAT 4800

CTTTAAGCTG GTTTCCTTGT CTAATACCAA GTATTTTGAA GGAGTGCGAC GGATTCCGAG 4860

AACGGTTCTA GATGCCCATC GAGAGGGCTT GATTTTAGGT TCAGCCTGTT CAGAGGGTGA 4920

AGTTTTTGAC GTGGTCGTTT CTCAAGGTGT GGATGCGGCG GTTGAGGTGG CCAAGTATTA 4980

TGATTTTATC GAGGTCATGC CACCGGCTAT CTATGCACCC TTGATTGCCA AAGAGCAGGT 5040

CAAGGATATG GAGGAACTCC AGACCATTAT CAAGAGTTTG ATAGAGGTTG GAGACCGCCT 5100

TGGCAAGCCT GTTCTGGCTA CGGGAAATGT TCACTATATC GAACCGGAAG AAGAGATTTA 5160

TCGTGAAATT ATCGTCCGTA GTTTGGGACA GGGTGCGATG ATTAATCGAA CTATCGGTCA 5220

TGGTGAACAT GCCCAACCAG CACCACTTCC AAAGGCTCAT TTTCGAACGA CTAATGAGAT 5280

GTTGGATGAA TTTGCCTTTT TGGGAGAGGA ACTGGCTCGT AAACTGGTTA TTGAAAACAC 5340

CAATGCCTTG GCAGAAATAT TTGAATCCGT TGAAGTCGTT AAGGGTGACT TGTATACGCC 5400

TTTCATCGAC AAGGCTGAAG AAACAGTTGC TGAGTTGACC TATAAGAAAG CTTTTGAGAT 5460

TTATGGAAAT CCGCTGCCAG ATATTGTTGA TTTGCGGATT GAAAAAGAAT TAACATCCAT 5520

ACTGGGGAAT GGATTTGCTG TGATTTATCT GGCATCGCAG ATGCTGGTGC AACGTTCTAA 5580

TGAACGGGGT TATTTGGTTG GTTCTCGTGG GTCTGTCGGA TCTAGTTTCG TTGCGACCAT 5640

GATTGGGATT ACGGAGGTCA ATCCTCTCTC TCCTCACTAT GTCTGTGGTC AGTGTCAGTA 5700

CAGTGAGTTT ATCACAGATG GTTCGTACGG TTCAGGATTT GATATGCCCC ATAAGGACTG 5760

TCCAAACTGT GGTCACAAAC TCAGTAAAAA CGGACAGGAT ATTCCGTTTG AGACCTTCCT 5820

TGGTTTTGAT GGGGATAAGG TTCCTGATAT TGACTTGAAC TTCTCGGGAG AAGATCAGCC 5880

TAGCGCCCAC TTGGATGTGC GTGATATCTT TGGTGAAGAA TATGCCTTCC GTGCGGGAAC 5940

GGTTGGTACG GTAGCTGCCA AGACTGCCTA TGGATTTGTC AAAGGTTACG AGCGAGATTA 6000

TGGCAAGTTT TATCGTGATG CAGAAGTAGA ACGCCTCGCT CAAGGAGCGG CGGGTGTCAA 6060

GCGGACAACA GGCCAACACC CGGGGGGAAT CGTTGTTATT CCGAACTACA TGGATGTCTA 6120

CGATTTTACG CCTGTCCAGT ATCCAGCAGA TGATGTCACG GCTGAATGGC AGACCACTCA 6180

CTTTAACTTC CACGATATCG ATGAGAACGT CCTCAAACTC GATGTACTGG GACATGATGA 6240

TCCGACTATG ATTCGAAAAC TTCAGGATTT GTCTGGTATT GACCCTAATA AAATTCCTAT 6300

GGATGACGAA GGCGTGATGG CACTCTTTTC TGGGACTGAT GTGCTAGGGG TAACACCTGA 6360

ACAAATTGGA ACGCCTACGG GTATGTTGGG GATTCCAGAG TTTGGAACAA ATTTCGTACG 6420

TGGAATGGTA GACGAAACCC ATCCGACAAC CTTTGCGGAA TTGCTTCAGC TGTCTGGTCT 6480

GTCCCACGGT ACTGATGTTT GGTTGGGGAA TGCTCAGGAT CTGATTAAGC AAGGAATAGC 6540

GGACCTATCG ACTGTTATCG GTTGTCGGGA CGACATCATG GTTTACCTCA TGCATGCGGG 6600

TCTGGAACCT AAGATGGCCT TTACCATTAT GGAACGGGTA CGTAAGGGTT TGTGGCTAAA 6660

GATTTCAGAA GAGGAGAGAA ATGGCTATAT CGAAGCAATG AAGGCTAATA AGGTGCCAGA 6720

GTGGTATATC GAATCCTGTG GGAAAATTAA GTACATGTTC CCTAAGGCCC ATGCGGCAGC 6780

CTACGTTATG ATGGCCTTGC GTGTAGCTTA CTTCAAGGTT CACCATCCTA TTTATTACTA 6840

CTGTGCTTAC TTCTCCATTC GTGCTAAGGC TTTTGATATC AAGACCATGG GTGCGGGCTT 6900

GGAGGTCATC AAGCGCAGAA TGGAAGAAAT CTCTGAAAAA CGGAAGAACA ATGAAGCCTC 6960

TAATGTGGAA ATCGATCTCT ATACAACTCT TGAGATTGTC AATGAGATGT GGGAACGAGG 7020

TTTCAAGTTT GGTAAATTAG ATCTCTACTG TAGTCAGGCG ACAGAGTTCC TCATCGACGG 7080

GGATACCCTT ATCCCACCAT TTGTAGCAAT GGATGGTCTG GGAGAGAACG TTGCCAAGCA 7140

ACTGGTGCGG GCGCGTGAAG AGGGAGAATT CCTCTCTAAA ACAGAACTAC GCAAGCGTGG 7200

TGGACTCTCA TCAACCTTGG TTGAAAAGAT GGATGAGATG GGTATTCTTG GAAATATGCC 7260

AGAGGATAAC CAGTTGAGTT TGTTTGATGA GTTGTTTTAA AAAATTGCTT AATAATCTAT 7320

TAAAAGAGGC TAACGTATAT CCAATAGATT TACATTAGCT TTCTTTTTTG TTAAAATAGT 7380

CTATGGAAAG AGGGTGAGAG TATGTCAAAG ATGAGTATAA GCATCCGTCT GGATAGTGAG 7440

GTTAAGGAGC AGGCCCAACA GGTGTTTAGT AATCTGGGAA TGGATATGAC AACAGCTATT 7500

AATATTTTCC TTCGTCAGGC AATTCAATAT CAGGGATTAC CTTTTGATGT TAGACTAGAC 7560

GAAAATCGGA AGTTGCTCCA AGCGTTAACG GATTTAGACC AAAATCGTAA TATGAGCCAG 7620

TCTTTTGAAT CAGTCTCAGA TTTGATGGAG GACTTACGTG CTTAAGATTC GTTATCATAA 7680

ACAGTTTAAA AAAGATTTTA AGTTGGCTAT GAAGCGTGGT TTGAAGGCAG AATTATTAGA 7740

AGAAGTTTTG AATTTTCTGG TTCAAGAAAA AGAACATCCT GCCAGAAATC GTGATCATTC 7800

ATTGACGGCA TCCAAGCATT TTCAAGGAGT TCGTGAATGC CATACCCAGC CAGATTGGCT 7860

TTTGGTTTAT AAAGTAGACA AGTCGGAATT GATTTTAAAT TTGCTGAGGA CAGGCAGTCA 7920

CAGTGATTTA TTTTAATCTA TTTTAAGGGG GTTCTCATGA AACTAAGAAT ATTTGCGGAA 7980

GATAAGCCGG CTAAGAAGGT ATTTGAATAT CAATTAGAAC TTGCTGATCG TACAATTCTT 8040

CTATCGACAG CACTCTTGTC AGGTGCTATT GCTTTAGCAG GAATCTTTTC TGCTTTGAAA 8100

GAAAAATAAA AATAGAAAAG AGAAAACAGA ATGGTTTTAC CAAATTTTAA AGAAAATCTA 8160

GAAAAATATG CGAAATTGTT GGTTGCGAAC GGAATTAACG TGCAACCTGG TCACACTTTG 8220

GCTCTCTCTA TTGATGTGGA GCAACGTGAA TTGGCACATC TAATCGTGAA AGAAGCTTAT 8280

GCCTTGGGTG CGCATGAGGT CATCGTTCAG TGGACAGATG ATGTGATTAA CCGTGAGAAA 8340

TTCCTCCATG CCCCGATGGA GCGTTTGGAC AATGTGCCAG AATACAAGAT TGCTGAGATG 8400

AACTATCTCT TGGAGAATAA GGCTAGCCGT CTTGGAGTTC GTTCATCTGA TCCAGGTGCC 8460

TTGAACGGAG TGGACGCTGA CAAGCTTTCA GCTTCTGCTA AAGCTATGGG ACTTGCCATG 8520

AAGCCTATGC GTATCGCAAC TCAATCTAAC AAGGTTAGCT GGACTGTAGC AGCTGCAGCA 8580 

GGACTTGAGT GGGCTAAGAA AGTCTTCCCA AATGCTGCGA GCGACGAAGA AGCAGTTGAT 8640 

TTCCTTTGGG ACCAAATTTT CAAAACTTGC CGTGTCTACG AAGCAGATCC TGTTAAGGCT 8700 

TGGGAGGAAC ATGCAGCCAT TCTCAAGAGC AAGGCCGATA TGCTTAATAA GGAGCAATTT 8760 

TCAGCCCTTC ACTACACAGC GCCAGGAACA GATTTAACAC TTGGTTTGCC AAAGAACCAC 8820 

GTTTGGGAAT CAGCTGGTGC TGTCAATGCA CAGGGCGAAG AATTCTTGCC AAATATGCCA 8880 

ACAGAAGAGG TCTTCACAGC GCCTGACTTC CGTCGTGCAG ATGGTTATGT CACTTCTACA 8940 

AAACCGCTTA GCTACAACGG AAATATCATT GAAGGCATTA AGGTGACCTT TAAGGATGGA 9000 

CAAATCGTAG ATATCACTGC TGAGAAGGGT GATCAGGTTA TGAAAGACCT TGTCTTTGAA 9060 

AATGCGGGTG CGCGTGCCTT GGGTGAATGT GCCTTGGTAC CAGATCCAAG TCCAATTTCT 9120 

CAGTCAGGCA TTACCTTCTT TAACACCCTT TTCGATGAAA ATGCGTCAAA CCACTTGGCT 9180 

ATCGGTGCAG CCTATGCGAC TAGCGTTGTT GATGGAGGGG AGATGAGCGA AGAGGAGCTT 9240 

GAAGCTGCAG GGCTTAACCG TTCAGATGTT CACGTAGACT TTATGATTGG TTCTAACCAA 9300 

ATGGATATCG ATGGTATTCG TGAGGATGGA ACGCGGGTAC CTCTTTTCCG TAATGGGAAT 9360 

TGGGCAAATT AAGGAGATAA TATGTTAGGA AGTATGTTCG TTGGTCTCCT AGTGGGATTT 9420 

TTAGCAGGTG CTATGACCAA TCGTGGAGAG CGAATGGGAT GTTTTGGAAA AATGTTTCTC 9480 

GGTTGGATCG GAGCCTTTCT AGGTCACTTG CTCTTTGGAA CTTGGGGGCC AGTTTTATCA 9540 

GGAACAGCTA TTATCCCAGC GATTTTAGGA GCCATGATTG TTTTAGCTAT TTTTTGGAGA 9600

CGAGGAA 9607

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14231 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

CTACAAGATA ATTCCAGCTA TAACATCCGC TATAATAGTA AGAGCGAGCT CTATGATAAG 60 

GCTCATTAGT TTCACCTCCT CTCACGAACC CATAGGAACG TAATCGGTAA CCGATGACAA 120 

AAATAGTATA CCACAATACA TTTAGATCAT CAAGGTCACT TAATTCTTGA AATATCAGAT 180 

CTAAGAGAAA AATCTTTAAA ATCAGAAAAA CGCATAATAT CAGGTGTGCA AAAACTTGAT 240

ACTATGCGTT TTATTGTGGG AAGGTTTACT CCATTTTCTC CTGAAATTGA GTTTTTGTCC 300

AGCCTCTGTT TTTAGGGTTG CTAAGAAAAT AATGTCATGT GGTGAATATT TGTAAATCAG 360

TCAGCAGACA GAACGATACT CTTCGAAAAT CTCTTCACAT CATGTCAGCT TCGTCTTTCC 420

GTATATATGT GACTGACTTC ATCAGTTCTA TCTACAACCT CAAAACAGTG TTTCGAGCTG 480

ACTTGATCAA TTTTCAAATC TGTACTTTGA GCAAGCTGAG ACTAGCTTCC TATTTGATTT 540

TCATTGAATA TCAGAAACCC ATTCTCCATC AAATAATTCG ACTGCGTCTA ATAATTTTTG 600

ATCTGGCACG GTGTCTGAAA TAAAGGTTGT GTATTTGGAG AGGGGATTAA TTTTAAAAAA 660

TCCAGTCTTG TAAAATTTAG AACTATCAAT CAGTAAGATG GTTTCATGGG CTTTGTCAAT 720

AATATTCTTT TTTGAAATAG CTTGGCTGAG AGAAGCTTCA TAAACATATT GGTCATCAAT 780

ACCTCTTGCT GAACAAAATG CTAAATCGAT ATTAAAATGA TCTAATAAAG AATTTTCCTT 840

ATCATAGTTG ACCACGGAAC AGGATTGATG TTTGACCTCG CCAGATGTGA TAAAGATTTT 900

GGAGCTATCT TTAACAGTTT CAGATAGGGT TTGTGCAGTA TGTAAACCAT TTGTAAAAAT 960

AATCAAATTA TCAAGTTCAG AAAGATAGGG ACAGAGTTCG TAGACAGTAG TACTAGAATC 1020

TAGATAGATA CACATACCAG ACCGAATAAA GTCTTTAGCG AGACTAGCGA TTAGTCTTTT 1080

TTGCCTAGTA CTTTCTCCTT CACGTATTTG ATGAGAAAGT TCAATTGTGT TCATAGAGGA 1140

CAGGGTCACG TATCCGTGCT TTCTTTTGAT AAGACCTTGA TTTTCTAAGA AAATTAAATC 1200

ACGACGTAAG GTACTTGTGC TGGAGAAAGT GATTTCTGCC AGCTCTTTTA CGGCAATTCT 1260

TTTTTTCTTT TTGATAATTT CAATCAATTC AAGTACACGT TCATCTTTTA TCATAAGCTC 1320

CTCCTAATTT ATCATTTCAA CTATATTATA GCACAAATTG GAGGAATTTG AATTATTTTT 1380

ATGAATATTG GGTTAACATT TGAACATTAT TCAAGTAAGC GTTCACATAT TGAAAAAATA 1440

AAACGTGGGG ATTATAATAA AGTTAATCmA GGACGAAGAG AGAAGAAAAA TGGAAGCGGT 1500

TTTAGCAATA GATTTAGGTG CGACTTCTGG AAGAGCAATC GTTGGTTACC TTTCTGAAAA 1560

TAAACTAGTA ATGGAAGAAA TAAATCGCTT TTCTAATCTA CCTATTAGAG TAAAAGGGCA 1620

TTTATCTTGG GATATTGACT TTCTACTAGC TAAAATTCTT GAAAGTATCC GCTTGGCTAA 1680

TACTAGTTAC AAGATTTTAT CTATCGGTAT TGACACATGG GGAGTTGATT TTGGACTGAT 1740

TGATAATGAA GGTAAGCTGT TATTACAACC TGTTCATTAT CGTGATGAAA GAACAAAGGG 1800

AGTGTTAAAG GAAATATCTG AAATGACTGA ATTAGAAAAA CTGTATTCAG AGACAGGAAA 1860

TCAGATTATG GAGATAAATA CCTTGTTTCA ACTCTTTAAG GCACGTCAAG AATCTCCTGA 1920

CTCTTTCTAT AAGACCAATA AGATTCTTTT AATGCCAGAT TTGTTTAATT ATCTCTTGAC 1980

AGGTAAGTTT GCTACAGAAA AAAGCATTGC TTCAACAACT CAATTATTTG ATCCTAGGAG 2040

TCAAAATTGG AATCAGAATA TCTTAAAACT ATTTGAATTG GATTCATCTT TACTTCCTGA 2100 

AATTGTTTCA GAGGGAAATG TTCTTGGAAG GATAAAAGAG GAGTATGGTT TAGGCGATAT 2160 

TCCTGTTGTG AATGTTTGTA GTCATGATAC AGCAAGCGCG ATTGTCTCAG TACCTAAGAC 2220 

AGAAGGTAGT TTATTTATTT CATCAGGTAC TTGGTCTTTG GTTGGAGTGG AACTTACTTC 2280 

ACCGATTCTT ACTACCGAAT CCTTCAGTTA TGGATTTACA AATGAAGTCG GTAAAGATGG 2340 

AGTGATTACA TTTCTGAAGA ATTGTACAGG GTTGTGGATC ATAGAGGAAC TAAGACGTTC 2400 

ATTTGAACGA AGAGGGAAAG CCTATTCTTT TGATGATATT AGGACAATGG TGGAGAAAGA 2460 

AAAAGAAAAT CTTCCTCTGA TTGATACTGA ATCAACTGAA TTTGCAACAG AATCTGATAT 2520 

GCACAAGACT TTGACAGAAT ATCTAGCTTA TCATCATGAA ACTAGAGAGT GGACAGATGG 2580 

ACAACTATTT AAGATTGTTT ATGAAAGCCT AGCTGAAACG TATAGGAAAG CGATAGAGTT 2640 

ACTAGAAGAA CTAACTCATA AGGTTTATAA GAGGATATAT GTGATTGGAG GAGGTGCTAG 2700 

AGCCAGTTAC TTTAACCAAA TGATTGCTGA TAGAACTGGT AAAGAGGTTC TTACAGGTTT 2760 

GACTGAGGCT ACAGCTGTGG GGAATATTGT TGTGCAGCTC ATAGCTATGG GACAATTAAA 2820 

AGGGATGGAA GAGGCTCACC ATGTTATTGA GGAGTTTCTA CAATTAGAGA GTTATTACTC 2880 

CCAAAAGAAT TAAAAAGATT GAGAGTTTGT AAATTTGCCT CCCTCCCCCT TCTTAGCTTT 2940 

TGTGCAGGAA GGGGGGATAA TTGGTGAATT GAAAAATATT TAGTGTTTTG ATATGAGGAG 3000 

GACAAGGATG TCAGATGTAA AACAAGAATT AATTAAATAT GGTAAGAAGC TAGTAGAAAC 3060 

AGATTTGACG AAAGGAACAG GTGGGAATCT CAGOGTTTTC GATCGTGAAA AACAATTGAT 3120 

GGCAATTACC CCGTCGGGTA TTGATTTCTT TGAAATCAAA GAATCCGATA TTGTAGTGAT 3180 

GGATATTAAT GGAAATGTTG TAGAGGGAGA ACGCTTGCCA TCTAGCGAAT GGTATATGCA 3240 

TTTGATTCAA TATCAAACTC GTGATGATAT CGATGCAATT ATCCATGCTC ATACAACTTA 3300 

TGCAACAGTA TTAGCTTGTC TCAGAGAACC ACTTCCAGCG AGTCATTATA TGATTGCAGT 3360 

GGCAGGGAAA GATGTTCGGG TAGCTGAGTA TGCAACATAT GGCACGAAAG AATTGGCTGT 3420 

GAATGCAGCT AAAGCAATGG AAGGTCGTAG AGCAGTTTTA CTAGCGAATC ATGGAATTTT 3480 

AGCAGGTGCA CAAAATTTAT TGAATGCATT TAATATTGTT GAAGAAGTTG AATATTGTGC 3540 

AAAAATTTAT TGTTTAGCTA AGAATTTTGG AGAGCCAGTA GTTCTTCCTG ATGAGGAGAT 3600 

GGAATTGATG GCAGAAAAAT TTAAAACATA CGGTCAGAGA AAATAGGGAG GATATTAATG 3660 

TTAAAACATA TACCGAAAAA TATTTCTCCA GATTTATTGA AGACTTTAAT GGAAATGGGA 3720 

CATGGAGATG AAATAGTATT AGCTGACGCG AATTATCCTT CTGCCTCATG TGCAAATAAG 3780

CTAATTCGTT GTGATGGTGT AAATATTCCA GAATTATTAG ATTCCATTCT GTATTTAATG 3840

CCATTAGATA GTTACGTCGA TAGTTCAATT CAGTTTATGA ACGTTGTTTC GGGTGATGAT 3900

ATTCCTAAGA TATGGGGTAC CTATAGACAG ATGATTGAAG GTCATGGTAC AGATCTTAAA 3960

ACGATTACTT ATCTTAGAAG AGAAGACTTT TATGAACGTA GTAAGAAAGC TTATGCTATT 4020

GTTGCTACAG GAGAAACTTC ACTTTATGCT AATATTATCC TTAAGAAAGG AGTAGTTGTT 4080

GAAAGAGAAA ATGTTCAATA GAGGAATTTT AGTTGCCAGT CATGGTAATT TTGCTAGOGG 4140

AGCTCTCATG ACCGCAGAAA TGTTTGTTGG TGAGACAACA AATGATAGAG TTAGGACATT 4200

AGGTTTGATG CCTGGAGAGA ATATTGTAGA GTTTGAGCAT TATTTTAAAA ATCAAGTGGA 4260

TGAACTGTTA GACTCAAATC AAGAGGTTAT CGTTTTGACT GACTTGATTG GAGGAAGTCC 4320

TAATAATGTG GCTTTGTCAC GGTTTTTAAA TTTGGATTCA GTTGATATTG TAACAGGGTT 4380

TAATATCCCT CTCCTAGTGG AATTAATATC AAGTTATGAT TCAAAAATCA ATTTAGAAGA 4440

AATTGTTCAC AATGCTCAAA ATAGTTTGTT TAATGTTAAA CAACAACTTA ACGTAGAGGA 4500

GGAAGAAGAT TTATGTCTAT AGAGTTTGTT CGTATTGATG ACCGTCTGGT ACATGGTCAA 4560

GTTGTCACTA CGTGGCTAAA AAAGTATGAT ATTGAGCAAG TTATCATTGT TAATGATCGC 4620

ATCTCAGAAG ATAAAACACG ACAATCTATT TTAAAGATTT CTGCACCGGT AGGTTTAAAA 4680

ATTGTTTTCT TTAGTGTAAA ACGGTTTGTG GAAGTTTTAA ACTCTGTGCC AATAAAAAAG 4740

AGAACAATGC TGATATATAC AAATCCAAAA GATGTGTATG ATTCTATTGA AGGAAATTTA 4800

AAATTGGAGT ACCTCAATGT AGGACAGATG AGTAAAACGG AGGAAAATGA AAAGGTAACG 4860

GGAGGTGTAG CTCTAGGTGA AGAAGACAAA TATTATTTTA AGAAAATAGT TGATAAGGGA 4920

ACGAGAGTTG AAATTCAAAT GGTTCCTAAT GATAAAGTTA CAATGTTGGA AAAATTTTTA 4980

TAAAAATAAT TTAAGGAGGT ACAGTATATG CTATTCACAC AAGCATTACT GGTGACATTA 5040

GTTGGGATTA TTGCCACTAT TGACTATAAT GGACCGTTAT TTATGATTCA CCGTCCGTTA 5100

GTTACAAGTG CAATGGTTGG CTTAGTATTA GGAGATTTCA CCCAAGGTGT TCTTATTGGT 5160

TCAGCTCTTG AATTAACTTG GCTCGGTGTA ACAGGTATTG GAGGTTATAC TCCACCAGAT 5220

ACTATTTCAG GTGCGATTAT TGGTACTGCA TTTGGTATTT TATCTGGTCA AGGAGAAACT 5280

GCTGGTATCG CTATAGCAGT TCCAATTGCA GTTGCTACCC AACAGTTGGA TGTTCTTGCA 5340

AAAACTTTAG ATGTTTATTT TGTGAAAAAA GCTGATAATG ATGCTAAAAA CGGAGATTAT 5400

TCAAAGATCG GTTTTTATCA TTATTCAAGT TTGGTTTTAA TCACGTTATT TAAAATTGTA 5460

CCAATTTTCC TAGCTATTAT GCTTGGAGGG GAATATGTGG CAGACTTGTT TGCTAAGGTT 5520

CCACCAATCG TTATGCAGGG ACTTAACTCT GCAGGTGCTT TACTACCTTC AATTGGTTTT 5580

GGTATGCTTT TAAATATGAT GCTCAAGAAA AATATGTGGG TATTCTTGTT GATTGGATTC 5640

ATTTGTTCTG TGTATGGAGG AATGTCAACC ATTGGGATCT CACTAGTTGG TATTGCGGTA 5700

GCATACTTCT ACGATATGAT TGGAAGCAAA CCACAAGAAA CAACTTCAAG TAGTGATGTT 5760

GAGGAGGATC TTGATCTATG ATGAATAATA AAGTAACTAA AGTTGAACTT AAAAAAGTTT 5820

TCAAACGAAG 


TTTTATGTAT GGTTCTTCAT 




unuAA -l\jl_Ao 


AACCTAGGTT 


5880 


TTCTATATAC 


AATTCTTCCA GTATTGAAAA 


ftnL 1 1\ I AV-\_L- 


AbALAAAbA I 


TCAGCTTCTC 


5940 


CTGCAATGAA ACGTCACCTT GAGTTTTTCA 


AlnLlLAllA 


AAt-AGCGGCA 


CCATTTATTC 


6000 


TTGGAGTTAC 


TTCCGCTATG GAAGAACAAG 


AMVjvaAAA I UA 


AGGTGCAGCT 


TCAATTACTG 


6060 


GTATTAAAGT 


TGGCTTGATG GGGCCACTGG 


V. 1\JU 1 V, 1 AIjLj 


A La A 1 AO I rTG 


TTCTGGCTGA 


6120 


CACTAGTTCC 


TATCTGTTTT AGTATTGGTG 


UUlLl 1A1 it. 


1 AAAUAL.GGC 


GGTGCTTTAG 


6180 


GTATCTTTAT 


CGCCTTAATA TTGTTTAATA 


1 1A1 1AA1 AI 


TCCTGTTAAA 


TATTTCGGTT 


6240 


TGAAATATGG 


GTATACTAAG GGTTCTAGTC 


IT* A TTT 1 & A n & 


AAA1 AATACA 


AAAGGAACAT 


6300 


TGAATCGCGT 


TACGAGTATG GCGACAGCAT 


X /VjvjOV- 1 AVj X 


AL 1 AtiTGGGT 


GGTTTGATTC 


6360 


CATCAATGGT 


TGGTATTAAT TTTGGATTAG 


AAl 1 X/UUiLA 


uuiiOvjAAL I I 


GTTATTTCTG 


6420 


TTCAAGAAAT 


GATTACAAAA TTAATTCCAG 


G ATTT ATCC C 


i A I i 1 1 V? 


ACTTTATTAA 


6480 


TGTGTAAATT AATTAGAAAA GGAAAGAATC CGGTTGTACT AATCTTTAGT GTTATGGCTA 6540

TTGGAGTTAT TCTAGTTGTT TTAGGAATTT TGAAGTAGTA GAAAGTGTGG AGGTGGTATT 6600

TGGGATATCA CCTCCATTTT GGAAGAGAGG TAAAGAGTGA AATTATGGTA TAAGAAAGCT 6660

GCCGCAAATT GGAATGAAGC CTTGCCGATT GGGAACGGTC ATTTAGGTGG TATGATTTAT 6720

GGTTCAGCTA CAAAAGAATG TATTCAACTA AACGATGAGA CTATTTGGTA TAGAGGAAAG 6780

TCAGATAGAA ATAATCCAGA CTCACTATTG CATCTTAAAA AAATTCGGGA ATATCTTTTA 6840

GATGGAGAAA TTCAGAAAGC CGAAGAATTG ATAAAGTTAA CAGTGTTTGC TACCCCAAGA 6900

GATCAAAGCC ACTATGAATT ACTTGGGGAA CTTTACATTG AGCATATAGA TATTCAGTCT 6960

TGTGCTCTTT CATTGTATGA AAGAGAGCTA GATTTAGATA CAGCTATTTC TAATGTTGTG 7020

TTTGAGCCTA ATAGTTGTAA TTTACAAATA AAAAGAGAAT ATTTTACGAG TTTTAATAAG 7080

AATATTTTAT GTTGCCGTAT AGTGTCATCA GTTCAAAACA CATTAAATTT AAACATTAAT 7140

TTGGGTAGAA ATAAACGGTT TAATGACGAA GTATCTAAAC TGGATTCAAG TACAATTTTA 7200

ATGTCGGCCT CTGCTGGAGG TAGAAAAGGT GTTCAGTTTA AAGTAGTATG TCATTCTAAG 7260

GTTACGGATG GTGAAGTAAG TGTATTGGGA GAGACAATAG TTATTCGGAA TGCTACAGAG 7320

GTATTTCTTT ATCTCAAATC AATGACGGAT TATTGGGGAA ATATAGATAT TTCTTCTCTT 7380

CAGGGAGAAT TTAGTAGTAT TGATTACTTT ACAGAAAAAG ATGAACATGT AAAAAAATAT 7440 

CAGGAGCAAT TTAATAGAGT TGATTTTAAA CTAGACTATA GTAAAGGTTG TCTTAGCATT 7500 

CCAACGAATC TACTTCTTGA AAACACTAAA AAGTATAGTA ACTACTTGAC TAACTTGTTA 7560 

TTTCATTATG GAAGATATCT GTTAATATCG TCTAGTCAAC CGAATGGTTT ACCTGCCAAT 7620 

CTTCAAGGAA TATGGTGTGA TGAATTAAAT CCAATTTGGG GTTCTAAATA TACGATTAAT 7680 

ATTAATACTC AAATGAATTA TTGGATGGTA GGTCCATGTG ATTTACCAGA AGTAGAATAT 7740 

CCATTATTTG ATATGCTCGA AAGAATGAGA GAACCGGGAA GACTAACCGC TAAGAAAATG 7800 

TATGGAGCTA GAGGTTTTAC AGCACATCAT AATACGGATG GTTTTGGCGA TACGGCTCCC 7860 

CAATCTCATG CCATGGGGGC TGCAATTTGG GTATTAACTA TTCCATGGTT ATGTACTCAT 7920 

ATTTGGGAAC ACTATTTATA TTTCCAAGAT GAGCGTATTC TTACGGAACA TTTTGAAATG 7980 

ATAAAAGAAG CATTTCTTTT CTTTGAAGAT TATTTATTTG AGGTGGATGG CTACTTGATG 8040 

ACAGGTCCAA GTGTCTCACC GGAAAATAAA TATCGCTTAA AAAATGGTAT TGAAGGAAAT 8100 

GCTTGTCTAT CATCTACAAT TGATAATCAA ATTCTAAGAT ATTTTTGTGA TTCATGCATT 8160 

GGCATTGCAA AACAATTAGG AGACAATTCG GATTTTATTA GTCGTGTGAA GGAGTTAAAA 8220 

AAGAAACTAC CTAAAACAAA AATAGGTAGT AATGGGCAAA TCCAAGAATG GTTAGAAGAT 8280 

TATGAAGAAG TAGAGCCTGG GCATAGACAC ATTTCACCTC TATTTGGGCT TTATCCTTAT 8340 

AATGAGATTG ATATTCATAA AACTCCGGAA TTAGCAGAAG CAGGTAAAAT CACTATCAAT 8400 

AGGAGATTAT CAAACGCTAA TTTTTTATCT TCACAGGAGA GGGAGCAAGC GATTAATAAT 8460

TGGTTAGTAA GTGGTTTGCA TGCTAGTACA CAAACAGGTT GGAGTGCTGC ATGGCTGATT 8520 

CATTTTTTTG CGAGACTATA TCAAGGTGAA CCTGCTTATA ACCAGATTAA TGGTTTGTTA 8580 

AATAATGCGA CTCTTGGCAA TTTATTTCTT GACCATCCAG CATTTGAAAT TGATGGTAAT 8640 

TTAGGTTTGG TGAGTGGAAT TTGTGAATTA TTAGTACAGA GCCATGATAA TTGGTTATCA 8700 

CTAATTCCAG CTTTACCTTC TGCTTGGTCA GAAGGAGAAG TGAAAGGTTT CAGAGTAAGA 8760 

GGAGGATATA AGGTATCGTT TGCTTGGAAA AATGGGGATA TAACATTGCT AAAATTGGAA 8820 

GGAGGAAACA AAGATCAAAA AGTAAGAGTA AGAATATATG GCAAAAATAC TGATGTACAA 8880 

AATATTGAAT TGGTATTTAA TTCAGAAAAA ATTATTGAGT TAAATTTTTA GGTATAAGTC 8940 

ATGAATAAAG AAAAAATAAA AAGAAAATTA ATCACAATAT TGTTTGTATG TATTGGGATG 9000 

TTATGTTTTG GATTGTTAGC AGGAGTTAAG GCTGATAATG GTGTTCAAAT GAGAACGACG 9060 




ATTAATAATG AATCGCCATT GTTGCTTTCT CCGTTGTATG GCAATGATAA TGGTAACGGA 9120 

TTATGGTGGG GGAACACATT GAAGGGAGCA TGGGAAGCTA TTCCTGAAGA TGTAAAGCCA 9180 

TATGCAGCGA TTGAACTTCA TCCTGCAAAA GTCTGTAAAC CAACAAGTTG TATTCCACGA 9240 

GATACGAAAG AATTGAGAGA ATGGTATGTC AAGATGTTGG AGGAAGCTCA AAGTCTAAAC 9300 

ATTCCAGTTT TCTTGGTTAT TATGTCGGCT GGAGAGCGTA ATACAGTTCC TCCAGAGTGG 9360 

TTAGATGAAC AATTCCAAAA GTATAGTGTG TTAAAAGGTG TTTTAAATAT TGAGAATTAT 9420 

TGGATTTACA ATAACCAGTT AGCTCCGCAT AGTGCTAAAT ATTTGGAAGT TTGTGCCAAA 9480 

TATGGAGCGC ATTTTATCTG GCATGATCAT GAAAAATGGT TCTGGGAAAC TATTATGAAT 9540 

GATCCGACAT TCTTTGAAGC GAGTCAAAAA TATCATAAAA ATTTGGTGTT GGCAACTAAA 9600 

AATACGCCAA TAAGAGATGA TGCGGGTACA GATTCTATCG TTAGTGGATT TTGGTTGAGT 9660 

GGCTTATGTG ATAACTGGGG CTCATCAACA GATACATGGA AATGGTGGGA AAAACATTAT 9720 

ACAAACACAT TTGAAACTGG AAGAGCTAGG GATATGAGAT CCTATGCATC GGAACCAGAA 9780 

TCAATGATTG CTATGGAAAT GATGAATGTA TATACTGGGG GAGGCACAGT TTATAATTTC 9840

GAATGTGCCG CGTATACATT TATGACAAAT GATGTACCAA CTCCAGCATT TACTAAAGGT 9900 

ATTATTCCTT TCTTTAGACA TGCTATACAA AATCCAGCTC CAAGTAAGGA AGAAGTTGTA 9960 

AATAGAACAA AAGCTGTATT TTGGAATGGA GAAGGTAGGA TTAGTTCATT AAACGGATTT 10020 

TATCAAGGAC TTTATTCGAA TGATGAAACA ATGCCTTTAT ATAATAATGG GAGATATCAT 10080 

ATTCTTCCTG TAATACATGA GAAAATTGAT AAGGAAAAGA TTTCATCTAT ATTCCCTAAT 10140 

GCAAAAATTT TGACTAAAAA TAGTGAGGAA TTGTCTAGTA AAGTCAACTA TTTAAACTCG 10200 

CTTTATCCAA AACTTTATGA AGGAGATGGG TATGCTCAGC GTGTAGGTAA TTCCTGGTAT 10260 

ATTTATAATA GTAATGCTAA TATCAATAAA AATCAGCAAG TAATGTTGCC TATGTATACT 10320 

AATAATACAA AGTCGTTATC GTTAGATTTG ACGCCACATA CTTACGCTGT TGTTAAAGAA 10380 

AATCCAAATA ATTTACATAT TTTATTGAAT AATTACAGGA CAGATAAGAC AGCTATGTGG 10440 

GCATTATCAG GAAATTTTGA TGCATCAAAA AGTTGGAAGA AAGAAGAATT AGAGTTAGCG 10500

AACTGGATAA GCAAAAATTA TTCCATCAAT CCTGTAGATA ATGACTTTAG GACAACAACA 10560 

CTTACATTAA AAGGGCATAC TGGTCATAAA CCTCAGATAA ATATAAGTGG CGATAAAAAT 10620 

CATTATACTT ATACAGAAAA TTGGGATGAG AATACCCATG TTTATACCAT TACGGTTAAT 10680 

CATAATGGAA TGGTAGAGAT GTCTATAAAT ACTGAGGGGA CAGGTCCAGT CTCTTTCCCA 10740 

ACACCAGATA AATTTAATGA TGGTAATTTG AATATAGCAT ATGCAAAACC AACAACACAA 10800 

AGTTCTGTAG ATTACAATGG AGACCCTAAT AGAGCTGTGG ATGGTAACAG AAATGGTAAT 10860 

TTTAACTCTG GTTCGGTAAC ACACACTAGG GCAGATAATC CCTCTTGGTG GGAAGTCGAT 10920

TTGAAAAAAA TGGATAAAGT TGGGCTTGTT AAAATTTATA ATCGCACAGA TGCTGAGACT 10980

CAACGTCTAT CTAATTTTGA TGTGATTCTA TATGACAATA ATAGAAACGA AGTTGCTAAG 11040

AAACATGTTA ATAATTTGTC GGGTGAATCT GTTAGTCTAG ATTTCAAAGA AAAAGGAGCA 11100

AGGTATATTA AAGTTAAATT ACTAACGAGT GGAGTGCCTT TGAGTTTAGC AGAAGTAGAG 11160

GTTTTTAGAG AATCAGATGG TAAGCAATCT GAAGAGGATA TAGATAAAAT AACAGAAGAT 11220

AAAGTAGTCT CTACAAATAA GGTAGCTACT CAAAGTTCAA CCAATTATGA GGGTGTAGCT 11280

GCTTTAGCAG TTGATGGTAA TAAAGATGGA GATTACGGAC ATCATTCGGT GACTCATACT 11340

AAGGCAGATT CTAACGCTTG GTGGCAGGTC GATCTGGGAG AAGAGTTTAC GGTTTCTAAA 11400

GTTGATATTT ATAATAGAAC AGATGCCGAA CCTCAGCGTT TATCTAATTT TGATGTTATT 11460

TTTCTATCTT CATCAGGAGA AGAAGTTTTT AGAAGACATT TTGATAAAGT AGTTGATGGT 11520

TTGTTATCTT TAAAAGTACC TTCTGTAGGG GCTAAGCTAG TCAAAATAGA ATTAAAATCA 11580

GCAGCTATTC CGTTAAGTTT AGCGGAAGTT GAAGTCTATG GTTCAAAGAG AACTCCGAAG 11640

AAACTTTCTA ATATTGCATT AACAAAAGAA ACTCGACAGA GTTCAACGGA TTACAATGGT 11700

TTTTCTCGTC TAGCAGTTGA TGGAAATAAA AACGGAGATT ATGGTCATCA TTCAGTGACT 11760

CATACCAAAG AAGATTCTCG TTCATGGTGG GAGATAGATT TAGCACAAAC CGAAGAATTA 11820

GAAAAGTTAA TTATTTATAA TAGAACAGAT GCTGAAATTC AGAGATTATC AAATTTTGAT 11880

ATTATTATAT ATGATTCAAA TGATTATGAA GTTTTTACAC AAGATATTGA CAGTTTAGAA 11940

AGCAATAATC TATCCATAGA CTTAAAAGGA CTGAAGGGAA AAAAGGTTAG AATTTCTTTG 12000

AGAAGCGCAG GAATTCCTTT AAGTTTAGCA GAGGTAGAGG TTTATACTTA TAAGTAATTT 12060

TAAAAATTAT CACCCAGGCT ACCGTAAATA TAATGGAGAT GGTAGTATGA AAGAAACAGA 12120

AAAATAAGAG GAAAATAGTA TGATTCAACA TCCACGTATT GGGATTCGTC CGACTATTGA 12180

TGGTCGTCGT CAAGGTGTAC GCGAATCACT TGAAGTGCAA ACAATGAAGA TGGCTAAAAG 12240

TGTGGCAGAT TTGATTTCAA GCACATTGAA ATATCCAGAT GGGGAACCTG TGGAATGCGT 12300

GATTTCTCCA TCTACTATTG GCCGTGTACC AGAGGCTGCA GCTTCCCATG AGTTGTTTAA 12360

AAAATCAAAT GTTTGCGCAA CAATTACAGT TACACCATGC TGGTGTTATG GTAGTGAAAC 12420

TATGGATATG TCTCCAGATA TTCCTCATGC TATTTGGGGA TTTAATGGGA CAGAACGCCC 12480

AGGAGCTGTC TATCTTGCAG CTGTACTAGC TTCACATGCT CAAAAAGGGA TTCCAGCCTT 12540

TGGGATTTAT GGAAGAGATG TTCAGGAAGC TAGTGACACA GATATTCCAG AAGATGTCAA 12600

AGAAAAACTT TTACGCTATG CGCGTGCAGC TCTTGCAACT GGCTTGATGA GAGACACTGC 12660

TTACCTATCA ATGGGTAGTG TTTCGATGGG GATTGGTGGT TCTATTGTAA ATCCGGATTT 12720 

CTTCCAAGAA TACTTAGGAA TGCGAAATGA ATCGGTAGAT ATGACGGAGT TCACGCGCCG 12780 

TATGGACCGT GGTATTTACG ACCCTGAAGA GTTCGAACGT GCGCTCAAAT GGGTGAAAGA 12840 

AAACGTAAAA GAAGGATTCG ACCATAACCG TGAAGACCTT GTTTTAAGCC GTGAAGAAAA 12900 

AGATAGACAA TGGGAATTTG TTATTAAGAT GTTCATGATT GGACGTGACT TAATGGTTGG 12960 

TAACCCAAGA CTTGCTGAAC TTGGTTTTGA GGAAGAAGCG GTTGGTCACC ATGCTTTAGT 13020 

AGCTGGTTTC CAAGGTCAAC GTCAGTGGAC AGACCATTTT CCAAATGGGG ACTTTATGGA 13080 

AACTTTCCTC AATACTCAGT TTGACTGGAA TGGTATTCGA AAACCATTTG TATTTGCGAC 13140 

AGAGAATGAT TCACTAAATG GTGTGTCTAT GCTCTTTAAT TATCTATTAA CAAATACTCC 13200 

ACAAATCTTT GCTGATGTGC GTACTTATTG GAGCCCAGAG GCTGTTAAAC GTGTAACGGG 13260 

ACATACTTTA GAGGGTCGTG CTGCAGCTGG CTTCTTACAT CTAATCAACT CTGGTTCTTG 13320 

TACATTGGAT GGTACAGGTC AAGCTACTCG AGATGGCAAA CCTATTATGA AACCATTCTG 13380 

GGAGTTGGAA GAAAGTGAAG TGCAGGCTAT GCTTGAAAAT ACAGACTTCC CACCAGCAAA 13440 

CCGCGAATAC TTCCGTGGAG GAGGATTCTC AACTCGTTTC TTGACGAAGG GGGATATGCC 13500 

AGTAACAATG GTACGTCTCA ATCTTCTAAA AGGGGTTGGT CCAGTGCTAC AAATTGCAGA 13560 

AGGTTACACA CTTGAACTTC CTGAAGATGT TCACCATACT TTAGATAATC GTACAGATCC 13620 

AGGATGGCCA ACTACTTGGT TTGCTCCACG TTTGACAGGA AAAGGTGCTT TCAAGTCTGT 13680

CTATGACGTC ATGAATAATT GGGGAGCTAA TCACGGAGGC ATAACATATG GACACATTGG 13740 

AGCAGACTTG ATTACCTTGG CTTCTATGTT GAGAATTCCT GTCAATATGC ATAATGTACC 13800 

TGAGGAAGAT ATCTTTAGAC CTAAAAATTG GTCCTTATTT GGAACAGAAG ATCTAGAATC 13860 

AGCAGACTAT CGTGCATGTC AGTTGTTGGG GCCACTACAT AAATAAAACT TGTTTATATA 13920 

GGAGGTGAAC TTACGTCCCT CCTATCCTTT TAAAAAGATT TGTTAAACAA TTCACAAATA 13980 

ATTGAAAACG AATACAAAAA GTAATATAAT GATGTTAAAT AGATAGCGCG GAGGCGCAGG 14040 

AGGAAAATTA TATGGCTATA TTTTATGTTC CGGCAGTCAA CCTTATTGGA AAAGGTGTTG 14100 

TAAATGAAGT GGGTCCTTAT ATCAAGGAAC TTGGCTATAA AAAGGCACTT TTGGTGACAG 14160 

ATAAGTACAT CGAAGGCAGT GATATTTTAC CTAAGACTTT AAAACCACTG GATACAGAAG 14220 

GAATCGAATA T 14231 
(2) INFORMATION FOR SEQ ID NO: 82: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 16995 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:



AGTTCTCTTA ACTTTTTTAG GATGGCATTC TCCGCTCTCA GGTACTCATT TTCTGCTgAA 60

GACGTTCTAA TTCTGTCCTC TCTTCAGGTC TCGTTTTTGG CTTACGTCCC ATTTTAGGTA 120

CTCTCCCTCT TGTTTTCTCA ACAATAGTAT ACCCGTTTTT CCTGTATTGT GCTAGCCAGT 180

TAAGAAGTAT CGTACGACTT GGGAGACCGT ATTCAAGAGA AACTCTATCT TTAGTCCAGC 240

CTTCATGTCA GACTTTATTA CTCATTTCTT GTTTTAAATC AGGAGAATAG TAACGATTTT 300

TTCCTTTTTT GACGAACTCT ATTCCGTAAC GATCAATCAA TTTAATCATG TACCTAATAT 360

TAGAATTGCT TATCCCAAAT TTATTTGAAA GCTTCTCTAA GCTATATCCT TGTTTTCTAA 420

GTTCATAGAT CTGAACTTTA TCATCATAAG TTAGTTTCAT AATAAAAACA CCCCAAAAGT 480

TAGATTTTTT CTGTCTAACT TTTGGGGTGT AGTTCATGTA CACCTGATAT GATGCGTTTT 540

ATAATTTTTA AGCCTTTTTG CCCAGCCTCG TCAAAAGTAA TGTTTTGACA CAAAATCTGT 600

GACAAAACTT TAGTTTTAAA GGTTTTTAAC TTTGTATATA CTAGTTTTAA GAAAAGGAGG 660

ATGATCTAAT GGAAGAAAAA GTATCATTGA AAGTCAGGGT TCAAAAACTA GGGACATCGC 720

TTTCAAATAT GGTTATGCCC AATATTGGAG CATTTATTGC TTGGGGAGTA TTGACTGCCC 780

TCTTTATCGC TGATGGCTAT CTGCCAAATG AACAGTTAGC TACTGTTGTT GGTCCTATGT 840

TAACGTATTT ATTGCCAATC CTGATTGGTT ACACAGGTGG ATATATGATC CATGGCCAAC 900

GTGGTGCCGT TGTAGGAGCT ATTGCTACTG TTGGTGCAAT CACAGGTTCT AGTGTTCCTA 960

TGTTTATCGG AGCTATGGTA ATGGGCCCAC TGGGAGGATG GACTATCAAG AAATTTGATG 1020

AGAAGTTCCA GGAAAAAATT CGTCCCGGAT TTGAAATGTT AGTTAATAAC TTCTCAGCTG 1080

GTCTCGTTGG TTTTGCATTA TTGCTTTTGG CTTTCTACGC AATCGGTCCA GTCGTATCGA 1140

CTCTTACTGG AGCTGTTGGG AATGGTGTTG AGGCTATTGT CAATGCTCGC CTCCTTCCTA 1200

TGGCTAATAT TATCATCGAA CCGGCTAAAG TCCTTTTCCT CAATAATGCC CTCAATCATG 1260

GCATTTTTAC TCCTCTGGGA GTAGAACAGG TAGCTCAAGC TGGTAAGTCA ATTCTCTTCC 1320

TATTGGAAGC TAATCCTGGA CCAGGTCTGG GAATTCTATT AGCTTATGCT GTATTCGGTA 1380

AAGGTTCTGC TAAATCTTCT TCTTGGGGGG CAATGGTTAT TCATTTCTTC GGAGGGATTC 1440

ATGAAATTTA CTTTCCTTAT GTTATGATGA AGCCTACTCT ATTTTTAGCT GCTATGGCAG 1500

GAGGTATCTC TGGAACTTTT ACTTTTCAAC TCTTAGACGC TGGTCTTAAA TCTCCAGCTT 1560

CACCAGGTTC TATTATTGCG ATTATAGCTA CGGCGCCAAA AGGTGTTTGG CCCCATCTAA 1620

ATGTTCTTTT AGGTGTTTTA GTGGCAGCAG TTGTTTCTTT CCTTGTAGCA GCCCTTATTC 1680

TTCATGCAGA CAAGTCAACT GAGGATTCGC TCGAAGCTGC TCAGGCGGCT ACCCAAGCAG 1740

CTAAGGCTCA GTCTAAAGGT CAGTTAGTAT CAACTTCTGT TGATGCAGTT GTTTCGACAG 1800

ACTCAGTGGA AAAAATCATT TTCGCCTGCG ATGCTGGTAT GGGAAGCTCT GCTATGGGAG 1860

CTAGTATTCT TCGAGATAAG GTTAAAAAAG CAGGTCTAGA GATTCCAGTA TCTAATCAGG 1920

CAATCTCAAA TTTGCTTGAT ACACCAAAAA CATT A ATTfTT TACTCAGGAA GAACTGACAC 1980

CAAGAGCTAA AGACAAGAGT CCAAGTGCTA TTCATGTTTC TGTTGATAAT TTCTTAGCGT 2040

CCTCTCGTTA TGATGAAATT GTAGCTTCAT TAAffiRRafiT TTCTCCAATA GCAGAAATTG 2100

AAGGAGATAT ACCAACTTCA GCACCAGTAfi ATAnrrarcr'A AAGTGACCTT AACCATATTG 2160

ATGCTGTAGT AGTTGCTTAT GGTAAAGPAP anvr AACTATGGGC TGTGAAACGA 2220

TTCGGGCTAT TTTTAGAAAC AAGAATATTC GTATTCCAGT TTCTACTGCC AAAATTTCAG 2280

AATTAGGTGA ATTTAATTCT AAAAACATAA TGATTGTAAC AACTATTTCT TTACAGGCAG 2340

AAGTGCAGCA AGCAGCACCG AATTCTCAAT TTCTTATTGT GGATAGTTTA GTAACAACAC 2400

CAGAATATGA CAAAATGGCT GCTAGAATGT ACAAATAGAA CTAGAGGTTT CTAAATTACG 2460

AATGCTATTA ACCAAACGAG AAGAACAATT ATTGAAGGCT TTCCTACATG TAGGGAAGCT 2520

TTCAATGCAA GATATGACTG AAATCTTACA GGTTTCATCT AGAACAATTT ATCGAACTTT 2580

A*WAf3A'P p P , IV2 21f , Ilf;iT , llf" , /" , & iwvirT'Mifnii mr^r* n » nv«-» n » ATAACGAAGC ATGGGAAATA 2640

CTATATTTTG ACTGGAGAGT TGGATGATTT GCCGACAGAA CTTGAAGTGT TAGTTGAGTA 2700

TAGTCCCCAA GAAAGACAAG AGTTGATTAC CTATCGCCTT CTGACTGAGA GTGGTTTTGT 2760

CACCAATGAA GCATTGCAAG AGTGCACGAA AGTCAGTAAT GTAACTATTA TTCAGGATAT 2820

TTCAGATATT GATAAGCGTC TTTTAGACTT TGATCTGAAA ATTGAACGAC AAAAAGGTTA 2880

TCGGATTTCT GGTGATTCAG TTGGTAAGAG AAGATTTTTG GCTATTTTAC TGACAAACTG 2940

TATCTCAGTA GCAGATTTTT CAACCGGTAA TTTTGGGAGC TTTGATATTT TAGAAGCAGA 3000

TAGAACTGGG CTGGCCAGTC AGATTGTTAA TAAGCAACTG TCAGGTTTTC CAGATATGGA 3060

TGCTAGGATG AAGATGTTTT TTGCGATCTT GTTATCTCTT ATAGGTCAGG AGCAAAACAT 3120

TGAAAATTCA CCTAATACTA GTAAGCAGGC TTTGGAAATT TCTCAAAAAA TTTTTCAAGC 3180

TTACTCTAAG CAGACTGCAC AATTTTATAG TATTCAGGAA ATTATCTATT TTGCGAGCAT 3240

CTTGGATGAA TTAATCATTA AACGTCAGGA CAATCCGCTC TTTACGGAGA AATTTGATGG 3300

TGAATTTTTC TACAATATTT CAAATCTGAT TGATACGGTT TCCATGTATA CCAAGATTGA 3360

CTTTTTTAAG GACAAGGTTT TATTCAATTT TCTTTTCCAT CATATTCGGC TCAGTTTAGG 3420

CGTCCCTATC CTTTTTCAGG GTGAAAATTT GCCAGAATCT ATCCAGATTT TAGTTGAAAG 3480

GAATAAATTT CTTTATACAG TCATCAGTCT TTTAGTGAAT GATATTTTTC CGAAATATCT 3540

TCATACAGAG TATGAGTATG GCATGATTGC CCTACATTTT ATCTCTAGCT TAGGCCGTAG 3600

TCCAGAGATT TATCCAGTCC GTGTTTTGCT TTTAACGGAT GAACGTCGGG TCACTAGAGA 3660

TTTATTAGTC AGTAAAATTA AGAGTGTTGC TCCTTTTGTA GAGTTGATAG ATATTCAATC 3720

TCTAGTAGAT TACCACAGTA TTGATCTCAG TCAGTATGAT TATATTTTAT CTACGAAGCG 3780

GCTGACTAAT CAGGAAATCG ATGTAATTTC TAGTTTTCCA ACCGTCAAAG AATTGCTTGA 3840

ATTACAGGAA CGACTTCAGT ATGTACAGGC ACATCGTACA ATTGTCGCGC GTGATGCTAT 3900

CGCTCCAGAG AAAAGTTATG ACTTGCAAGA TTATTTAATA TCTAGTAGTC AGCTTTTGAG 3960

TCAATTCGAG TTGGTTCAAT TGGAGAATAA TCAATCATTT GAGCACACGG TAGAACAAAT 4020

CATCCAATAT CAGAAGAATG TGAGTGACAG AGCTTACCTA ACAAGAAAAT TGTTATCTCA 4080

CTTCCAGAAT AGTCCTATGG CTATTCCTAA TACTGGTCTG GTGCTTTTAC ATAGTCAGTC 4140

TAGCAAAGTA ACAACAAATA GTTTTACTAT GTTTGAACTC AAACTACCTA TCTCCGCATT 4200

GTCAATGAAA CGAGAGGAAG AAGAGGTCAA AAGGTGTCTG CTAATGCTAA TGTCTAAAGA 4260

AGCTAGCGAG GAAGCTAGAG ATTTAATGAC AGCTATTAGT CAGTCGATTA TTGAAAATCA 4320

TCTTTATACA GAGATTTACA AGACGGGAAA TCAATCCATT ATTTATCAGA TGCTAAATAC 4380

TATTTTTAAC GAAAAAATTA AGAAATTGGA GAACTAATAT GAAACTTGAA AAACATTTGA 4440

TTAAGCTTAA TAAACAATTT TCTAACAAGG AGGAAGCTAT TTGTTATTGT GGGCAAGTTC 4500

TTTATGAGGG TGGATATGTT AATGAAGACT ATATTGAAGC CATGATTGAG CGAGATAAAG 4560

AGCTATCTGT TTACATGGGT AACTTTATCG CCATACCGCA TGGAACAGAT GCAGCAAAAA 4620

ATGATGTCCT CAAGTCTGGT ATTACAGTCG TTCAAGTCCC TAGAGGGGTT GATTTTGGGA 4680

ATGTATCTAA CCCTCAAGTG GCAACGGTTC TTTTTGGTAT TGCTGGTATT GGTAATGAAC 4740

ACTTAGAAAT TATTCAGAAA ATTTCTATCT TCTGTGCAGA TGTAGATAAT GTTCTTAAAC 4800

TAGCAGATGC TCAGTCAAAA GAGGAAGTAT TGCGCTTATT TGATGCTGTT GAATAATTGA 4860

ATTTAGTCAT TTGTCATCTA GTATATATGT CCCTCAAATA GGAAAAGGAG AAATTGAATG 4920

AAACATTCTG TTCATTTTGG TGCCGGTAAT ATCGGTCGTG GTTTTATAGG TGAAATTCTA 4980

TTTAAAAATG GTTTCCATAT TGATTTTGTG GATGTCAATA ATCAGATAAT TCATGCTGTG 5040

AATGAAAAGG GCAAGTATGA AATTGAAATT GCACAGAAAG GACAGTCTCG TATAGAAGTA 5100

ACTAATGTGG CTGGCATTAA TAGCAAAGAA CATCCTGAGC AAGTCATTGA AGCGATTCAA 5160

AAGACGGATA TTATTACTAC TGCAATCGGA CCTAATATAC X^.l>V_ I 1 1 1 A 1 LCj AACTT . 5220

CTAGCCAAAG GAATCGAAGC TCGCCGAGTT GCAGGAAATA WiA I \j I r ATG 5280

GCCTGTGAAA ATATGATTGG CGGGTCTCAA TTTCTTTATC AnuAAu 1\.AA GAAATATTTA 5340

AGTCCGGAAG GTTTGACATT TGCTGATAAC TACATAGGTT TrTCCAAATGC TGCAGTAGAC 5400

AGGATTGTTC CAGCACAAAG TCACGAAGAT TCCCTTTTTG TTGTGGTCGA GCCCTTTAAT 5460

GAATGGGTCG TGGAAACCAA GCGTCTTAAA AATCCAGATT TACGTCTAAA AGATGTGCAT 5520

TATGAAGAAG ATTTAGAACC CTTTATTGAG CGAAAACTTT TTTCAGTCAA TTCTGGACAT 5580

GCAACTTCAG CTTACATTGG TGCGCATTAT GGTGCCAAGA CAATTTTGGA AGCTCTTCAA 5640

AATCCTAATA TTAAATCTCG GATTGAATCT GTATTAGCTG AAATTCGGAG TCTCTTGATT 5700

GCCAAATGGA ACTTTGATAA AAAAGAATTG GAGAATTATC ACAAAGTCAT TATAGAACGA 5760

CTTGAAAACC CTTTCATAGT GGACGAGGTT AGTCGCGTAG CTCGTACTCC AATCCGAAAA 5820

TTAGGCTATA ATGAACGATT CATCCGGCCG ATACGTGAAT TGAAAGAACT CAGTTTGTCA 5880

TATAAAAACC TACTTAAAAC AGTTGGCTAT GTCTTTGACT ATCGCGATGT AAATGATGAA 5940

GAAAGTATTC GATTAGGTGA ATTGTTGGCT AAACAATCAG TCAAAGATGT TGTTATACAA 6000

GTTACAGGTT TAGACGACCA AGAATTGATT GAGCAAATTG TAGAGTATAT TTAATCTTTT 6060

TCGAAAATCT CTTCAAATCA GGTTAGCATC GCTTTGTCTT AGGCATATGT TGTTCTATCT 6120

ACAACCTCAA AGCAGTGCTT TGAGCTGACT CCGTCAGTCT TATCTGCAAT CTCAAAACAC 6180

TGTTTGAGTT ATCTGCGGTA ATCTTTCTAG CTTGTCTTTG ATTTTTGTTG TTATTTATAA 6240

GGTAAAAGAA GCTGGACAAA AAGTCTTCAA AATCGGGAAA AGGCAGCCTA TCGGGTGTTC 6300

AAAAATCTTG ATAGGATGTC CTTTATTATG GAAAGCCTTA TTGGATTTTC TCCTCAGATT 6360

GAGTTTTTGA TCAGCTTTAT GAGATAGGTC TTGCTAGAGA TGTAGCCCAT CATGTTATTT 6420

TTATGGACAG TGGGAAAATT GTTGAAAAAA ATAATGCCCA TCAATTCTTT AGTCGTCCAA 6480

GAGAAGAACG AACCAAGCAA TTTTGGAACG AATTCTTTCG AATGCGATCT ATATAGTAAA 6540

ATGAAACAAG AACAGGACAA ATCGATCAGG ACAGTCAAAT CGATTTCTAA AAATGTTTTA 6600

GAAGTAGAGG TGTACTATTC TAGTTTCAAT CTACTATATA ACTGAAAAAT TAGATAAATT 6660

AGTTTTGGAA AATGACTAAC CAAAAGATAT CCAAAGTAGT CTAAAATTGT CTATACTTTA 6720

TGAGTGTTTT AGTTAGGAAA AAGGCTTGTT GTCTATAATT GTCTGCATTA GTCTAGATTT 6780

TATTTATAGA AAATGTTATA ATAGACTGTA TTTAAAAAAT TTTAAGGAGA AATGACAGAA 6840

TGTCTGTATC ATTTGAAAAC AAAGAAACAA ACCGTGGTGT CTTGACTTTC ACTATCTCTC 6900

AAGACCAAAT CAAACCAGAA TTGGACCGTG TCTTCAAGTC AGTGAAGAAA TCTCTTAATG 6960

TTCCAGGTTT CCGTAAAGGT CACCTTCCAC GCCCTATCTT CGACCAAAAA TTTGGTGAAG 7020

AAGCTCTTTA TCAAGATGCA ATGAACGCAC TTTTGCCAAA CGCTTATGAA GCAGCTGTAA 7080

AAGAAGCTGG TCTTGAAGTG GTTGCCCAAC CAAAAATTGA CGTAACTTCA ATGGAAAAAG 7140

GTCAAGACTG GGTTATCACT GCTGAAGTCG TTACAAAACC TGAAGTAAAA TTGGGTGACT 7200

ACAAAAACCT TGAAGTATCA GTTGATGTAG AAAAAGAAGT AACTGACGCT GATGTCGAAG 7260

AGCGTATCGA ACGCGAACGC AACAACCTGG CTGAATTGGT TATCAAGGAA GCTGCTGCTG 7320

AAAACGGCGA CACTGTTGTG ATCGACTTCG TTGGTTCTAT CGACGGTGTT GAATTTGACG 7380

GTGGAAAAGG TGAAAACTTC TCACTTGGAC TTGGTTCAGG TCAATTCATC CCTGGTTTCG 7440

AAGACCAATT GGTAGGTCAC TCAGCTGGCG AAACCGTTGA TGTTATCGTA ACATTCCCAG 7500

AAGACTACCA AGCAGAAGAC CTTGCAGGTA AAGAAGCTAA ATTCGTGACA ACTATCCACG 7560

AAGTAAAAGC TAAAGAAGTT CCGGCTCTTG ACGATGAACT TGCAAAAGAC ATTGATGAAG 7620

AAGTTGAAAC ACTTGCTGAC TTGAAAGAAA AATACAGCAA AGAATTGGCT GCTGCTAAAG 7680

AAGAAGCTTA CAAAGATGCA GTTGAAGGTG CAGCAATTGA TACAGCTGTA GAAAATGCTG 7740

AAATCGTAGA ACTTCCAGAA GAAATGATCC ATGAAGAAGT tcaccgttca GTAAATGAAT 7800

TCCTTGGGAA TTTGCAACGT CAAGGGATCA ACCCTGACAT GTACTTCCAA ATCACTGGAA 7860

CTACTCAAGA AGACCTTCAC AACCAATACC AAGCAGAAGC TGAGTCACGT ACTAAGACTA 7920

ACCTTGTTAT CGAAGCAGTT GCCAAAGCTG AAGGATTTGA TGCTTCAGAA GAAGAAATCC 7980

AAAAAGAAGT TGAGCAATTG GCAGCAGACT ACAACATGGA AGTTGCACAA GTTCAAAACT 8040

TGCTTTCAGC TGACATGTTG AAACATGATA TCACTATCAA AAAAGCTGTT GAATTGATCA 8100

CAAGCACAGC AACAGTAAAA TAATCTTAAT AAACAGAAAA CCCACCTGAA TTGGTGGGTT 8160

TTCTGATGCA CTATTTTCCA AAAATCTCTT TGAGGTCTGT GTCTGTAATC CCAATCATGG 8220

CTGGGATGCG GTCCCAGTTT TCTTCGGTTA GGATGTAGGA TTGTTCAGAG GCACTTGATG 8280

TGACTGTTTC AGAGACAGCT TGTTGCTTTT CTTCAACATT CTCCAGTAGA TCACTGAAGC 8340

GTTCAATCAG ATAGGTTTTT CGGGCAGTTC CGATGTGTTG GGTAGCATAG TCGAAGGCTT 8400

GTAATTCGCC TAGTAAGATG AGTTTGCTTT TGGCACGTGT AATGGCTGTG TAGATGAGAT 8460

TTCGCTCCAG CATACGTCGG CTAGCACTAG TAATCGGTAG GATGACAACT GGGAACTCAC 8520

TTCCCTGAGA CTTATGAATA CTCATGGCAT AGGCCAAGCG AATCTTGTAC CATTCGTTAG 8580

GGGGGTAAGA GACTTCATTA CCATCAAAAT CAATGACAAT CTCGTCTTGT TTCGATTCGG 8640

TGTATTTACC AGGAATCAGG TCTGTGATAG CTCCTAAATC CCCATTAAAG ACATTGATTT 8700 

CAGCATCGTT AACCAAATGA ATGACCCTGT CTCTCTTACG ATAGTGACAC TGAGGAGCTT 8760 

CAAAACTGAG TTGATCTTTT TGTGGGGGAT TGAGCAGGTC TTGCATGAGC TGATTGATAG 8820 

CATCAATCCC TGCCGTCCCT CGGTACATAG GAGCCAGAAC TTGGATATCA CGGGCGGGAA 8880 

TACCATTTCT GAGGGCGGCA CCTAAGATTT TTTCAATGGT GGCAGGAATA TGGCCACTAG 8940 

CAATTTCAAA GTAGGAACGG TCAGCTTTTT TTTGGGTGAA ATCAGCTGGC AAGATGCCCT 9000 

GTCGAATCTG ACTAGCTAGG GTGACGATGG TTGATTCTTT GCTTTGTCGA TAAATTTTTT 9060 

CCAAGCGAGT CTGAGGAATC AAAGGAATAT GAAGTAGATC CGCTAGAACC TGTCCAGGAC 9120 

TGACAGAAGG TAGCTGATCA CTGTCACCTA CGATGAGGAT CTTACTGTTA GAAGAGATAT 9180 

TGGAGAAGAG TTGATTGGCC AGCCAAGTAT CTACCATAGA GAATTCATCC ACGATGATAA 9240 

AGTCAGCATC TAGGTAATCT TCCAGATGAC TGGTATCATC GTCACCTGTC ATTCCCAAGT 9300 

GGCGATGTAT GGTCGCGCTA GGCAAACCTG TCAATTCATT CATGCGACGA GCAGCTCGAC 9360 

CAGTTGGAGC AGCAAGAAGA ATGGGCAGAT TGCTTTTCTT CCTGAAGTCA AGTCCTTCTA 9420 

AAAGGGCATA AACAGCAATG ATTCCATTGA TAACAGTTGT CTTACCAGTA CCAGGCCCAC 9480 

CTGTCAGGAT AAAGACCTTA TTCTGGATAG CATCACAGAT AGCCTGTTTT TGAATGTTAT 9540 

CATACTCAAT TCCCAGTTCT TGCTCGACAG TAGTGATATG TTTTTGAATG GTTTCTAAAT 9600 

CATGACTCTT CTGTTTTCCT TTTTCAAGGA TACGAACCAA GTGACTGCGG ATGCCTTCCT 9660 

CAGCGAAAAA GAGGCTGTTG TCAAAGATCT TGGTATCAAT CTGCTGAACC TTGTCTTCTT 9720 

CGATCAGGTA GGAGAGCTCT TGGGCAACTT GGCTGGGGTC TAGTTCCACG GGACGGGAAG 9780 

ACTCAAGGAG AGTAAGGGTT TGTTCCAGCA AATCCCGTGC TTCAACATAG GTGTCCCCTG 9840 

TTTCCATACA GGCCTGAAAA AGACTGTGAA CTAGACCGGC GCGGAAGCGT TCAGGAGCCT 9900 

GACTTTCGAT GCCTAGTTCC TCAGCTAGTT GGTCAGCAAT GGTAAAGCCC AAACCCTTGA 9960 

TATCCTCAAC CAGCTGGTAG GGATAATTTT CAACCACATC AAGGGTTTCT TCCTTGTAAA 10020 

AGTCTTGAAT CTGAAAGGCT AGTTTGTTGG GAATGCCGTA GTTGGCTAGT TTGGCCAAAA 10080 

TCATCTCCGT TCCGTAGTTG AGACGGAGAG TGGAGACGAA AGCCTCGCGA TTTTTGGCAG 10140 

AGAGTCCTGC GATGCCTTCT AACTTTTCTG GGTGTTGCAA AATTTCGTCA ATGGTATTTT 10200 

CGCCATAGGT ATCCACGATT TTCTGAGCTG TCTTGAGACC AATCCCCTTG AAATGGCTAC 10260 

TTGAAAAGTA CTTGACCAAG CCCTTACTAG TTGGTTTTGC GCGATCATAA CGACTGATTT 10320 

GCAGTTGTTC TCCATACTTG GAGTGCTGGA CAATTTGCCC CCAAAAAGTA TAGTCTTCGC 10380 
CCTCAATTAC ATCAGCCATG GTTCCTGTGA CAATGATTTC AAAATCATCA AAATCCTCTG 10440

CGTCCGTATC GTCGATTTCT AGGAGGAGGA TGCGATAAAA ATTGCTGGGA TTTTCAAAAA 10500 

TAATCCGTTC AATAGTTCCT GAAAAATAAA CTTCCATAAA ATTCCTTTGC ATGAATAGGT 10560 

GAGAGTTGGG ATTGTTTTTA TTTTATACTC TTCGAAAATA TCTTCAAACC ACGTCAGCTT 10620 

CCATCTGCAA CCTCAAAACA GTATTTTGAG CTGACTTCGT CAGTTCTATC CACAACCTCA 10680 

AAACACTGTT TTAAGCAGCC TACGGCTAGC TTCCTAGTTT GTTCTTTGAT TTTCATTGAG 10740 

TATTTGTAAA TAAACAATCA CTTCTCACGA TAGAAGAAGA GGCTGAGATT GGTGATTCTC 10800 

TGCCTCTTAG GTTTCTTAAA ATGTTCCGAT ACGGGTGATT GGCCATAAGC GGAATTTAGC 10860 

TTCCCCTGTG ATATCTTTTG CTTTGAAGGT ACCTACGTGG CGGCTGTCGC TCGAAACCAA 10920 

GCGGTCATCT CCGAGGAGAA GGTATTCTCC TTCTGGAACA GTAAAGCTAA AGTTGGTGTT 10980 

GTAGTTGACA TCAACTGTGA AGGCTTGAGC TTTTTGAGCG ATACTTCTAA AGAAAGTTCC 11040 

TTTATTTCCT TCAAAGCCCT TGCCTGAGTA AGTGCTTTGG AGTTTGTCAT CCTTGAAGCG 11100 

TTTGATATAG TCTGCTAGAT AAGGCTCGTC CGTTTCTTTG TCATTGATGT AGAGTTTATC 11160 

ATTTTCGTAA CGAATGGTGT CGCCAGGCAT TCCAATCACG CGCTTGACGA TGTCCTTATT 11220 

GCCATCTTCC TCATGGGCCA CCACGATATC AAAACGGTCA ATAGGAAGGT GTTTTACAAC 11280 

GAAGAGAATT TCGCCATCCG CTAGGGTCGG ATCCATGGAA TGTCCTTCTA CGCGAACATT 11340 

GCTCCAAAAA AAGATACGAC TTAAAGCTAG TAATGACAGA ATTAGGAGGA ACAATCCCCA 11400 

CTCTTTTAAG AAATTTTTAA ATGAATTCAT AACTTACCTT TCTAAGCGTT TTTTCGCTTT 11460 

TTCAGTGTTT TTAAAGTGCA ATTTGGCGCA GAAGCTGAGT CCCTGCATAC CATAGGCTTG 11520 

CAAAATCTGG CTAGCCACCT TGTCAGAAGC CGTTCCAGCT CCACTTGGGA GCTGATAACC 11580 

CAGTTCTCGT CCCAAATTTT CAAGATTTTC CAGAAAGAGA TCACGCGCAA TGACAGAAGA 11640 

AACTGCGACA GACAAGTATT TGCCCTCAGC CTTTTCTTCT AAGCTGATAG GATTGCTGAA 11700 

ACGATTGGCC TCTTGTGCCA AGTACTTGTC ATAATTTTTA GCACTGGTAA AGGCATCAAT 11760 

CACAATTTTC TCAGGCTGAA CACCTTTTTG AAGGAGGAGA TAGATAGCCT GATTATGGAG 11820 

GGCAACCTTA ACCGAAACAG CGTTGTAGCG GTCTCCGATG ACCTCGTTGT ACTTGCTGGG 11880 

TGAGAGAAGG AGTGCCTGGT GCTGAATTTT TTCCTTGAGA ATAGGAGTAA TCTGACGGAT 11940 

CTTTTGGTCG GTCAGAGTCT TAGAATCCCC CACACCGAGT TTTCGTAAAA AGTCGTGCTG 12000 

GTCAGGTGTG ACAAAGGCAG CCACAACTGC AAGCCCACCA AAGTAGGAAC CATTTCCCAC 12060 

CTCATCTGTC CCAATTAAAG GAAGATTTTG TCCGCTGGTT TGCTCTACAG CTTGATAGCG 12120 

AAAGAAACTG GCGTATTTTT CAGCCCCTTC ACCCTGAAGC AAGATTTTTC CAGAAGTATA 12180

GATAGAAACC GTTGCTTGAG GTAGTTTCAA AAAGTAGCGG ATATAGGGAT TCTTGCTAGG 12240

AGCCAGACTG GTTTGATAGT GTTCAAGAAA AGCCTGAATA TCCTTTTCGC TTGGTGTGAG 12300

TGTGATACTT GCCATAGTTT CTATTGTACC ACAAAAGCAG TAAAATTTGT AAAAACTGAC 12360

AAAATTAGCG AATTTTGGTA TAATATCQTG AGGTGAATTT TATGGCAAAT CTAAATCGAT 12420

TCAAATTTAC ATTCGGGAAA AAATCGTTAA CCTTGACAAG CGAACATGAC AACCTTTTTA 12480

TGGAGGAAAT CGCTAAGGTT GCGACAGAAA AATACCAAGC AATTAAAGAA CAAATGCCTA 12540

GCGCAGATGA TGAAACAATC GCTCTTTTGT TGGCAGTCAA CTGTTTATCA ACTCAGCTCA 12600

GCCGTGAGAT TGAATTTGAC GATAAGGAGC AAGAGCTAGA AGAACTCCGT CACAAGCTTG 12660

TGACTTGTAA GCAAGAACAG AGCAAGATTG AGGATTCCTT ATGATTTCAT TCCTTCTTCT 12720

ATTGGTCTTG GTTTGGGGAT TTTATATCGG CTATCGGAGA GGCCTGCTCT TACAGGTTTA 12780

TTACCTGATT TCAGCCATGG CATCGGCTTT CAGTTTTATA AGGGGCTTGG 12840

AGAGCAATTC CATTTATTGC TCCCTTATGC AAATTCGCAG GAAGGTCAGG GGACTTTCTT 12900

TTTCCCATCG GATCAACTCT TTCAGCTGGA TAAGGTCTTT TATGCAGGTA TCGGCTACTT 12960

GCTTGTATTT GGGATTGTCT ATAGCATTGG TCGTTTACTT GGTCTTCTCT TACACTTGAT 13020

TCCTAGCAAA AAACTGGGTG GTAAGTTGTT CCAAGTTTCA GCAGGTATCT TGTCCATGTT 13080

GGTGACCTTA TTTGTCTTGC AAATGGCCTT GACAATCTTG GCGACCATCC CCATGGCAGT 13140

TATACAAAAT CCTCTTGAAA AGAGTATCGT CGCAAAACAC ATCATCCAGA GCATACCGGT 13200

AACAACCAGT TGGCTCAAAC AAATCTGGGT GACAAATTTA ATCGGATAAA AAGGGCAGGA 13260

GTTTTCCTAG CCCTTTGTTT ACAGATTTGA CTCGAATCTA TCAGAATGTA AAAAGCTACC 13320

ACACCTAGAC ATTCAAAGAC AAGGAAATAA AGATGAATAA GAAAATATTA GAAACATTAG 13380

AGTTCGATAA GGTCAAGGCC TTGTTTGAGC CTCATTTGTT GACCGAGCAG GGCTTGGAGC 13440

AATTGAGACA ACTGGCTCCG ACTGCCAAAG CAGATAAAAT CAAACAGGCT TTTGCTGAGA 13500

TGAAGGAAAT GCAGGCTCTT TTCGTCGAGC AACCGCATTT TACTATTCTC TCAACTAAGG 13560

AAATTGCAGG AGTCTGCAAG AGGTTGGAGA TGGGAGCGGA TCTCAATATC GAGGAGTTCC 13620

TACTCTTGAA ACGCGTGCTT CTTGCCAGCC GAGAACTTCA AAATTTTTAC ACCAATCTGG 13680

AAAATGTCAG CTTGGAAGAA TTAGCCCTTT GGTTTGAGAA ATTACATGAT TTTCCGCAAT 13740

TACAAGGAAA TCTTCAGGCC TTTAATGATG CGGGTTTCAT TGAAAATTTT GCCAGTGAAG 13800

AATTGGCGCG AATCCGTCGA AAAATACATG ATAGCGAGAG TCAGGTACGC GATGTTTTAC 13860

AAGACTTGCT CAAGCAAAAA GCGCAGCTGT TGACGGAAGG AATTGTTGCT AGCAGAAATG 13920

GCCGTCAGGT TTTACCAGTC AAAAACACCT ACCGCAATAA GATTGCAGGT GTCGTTCATG 13980

ATATTTCTGC TAGTGGAAAC ACCGTCTATA TCGAACCCCG TGAGGTAGTC AAACTGAGCG 14040

AAGAAATTGC TAGTCTGCGA GCAGATGAGC GCTATGAAAT GCTTCGCATT CTCCAAGAAA 14100

TTTCTGAGCG TGTCCGCCCT CATGCGGCTG AGATTGCTAA TGACGCTTGG ATTATCGGTC 14160

ATCTGGACTT GATTCGTGCC AAGGTTCGAT TTATCCAAGA AAGACAAGCA GTCGTGCCTC 14220

AGCTGTCAGA AAATCAAGAG ATTCAACTGC TCCATGTCTG GCATCCTTTG GTCAAAAATG 14280

CCGTCGCAAA TGATGTCTAT TTTGGTCAAG ATTTAACAGC TATTGTCATT ACAGGTCCCA 14340

ATACAGGTGG GAAGACCATC ATGCTCAAAA CTCTGGGCTT GACACAGGTC ATGGGCCAGT 14400

CAGGATTGCC GATTTTAGCA GACAAGGGAA GTCGTGTTGG TATTTTTGAA GAAATCTTTG 14460

CTGATATTGG AGATGAGCAG TCTATTGAGC AGAGCTTGTC TACCTTCTCT AGTCATATGA 14520

CCAATATCGT GGATATTCTT GGCAAGGTCA ACCAACATTC ACTCTTACTT TTGGATGAGT 14580

TGGGGGCTGG TACTGATCCC CAAGAGGGAG CAGCCCTTGC CATGGCTATT CTGGAGGAGC 14640

TTCGCCTGCG TCAAATCAAG ACCATGGCGA CGACCCACTA TCCAGAACTC AAGGCGTACG 14700

GTATTGAGAC AGCCTTTGTG CAAAATGCCA GTATGGAGTT TGATACTGCA ACTCTTCGCG 14760

CGACCTATCG CTTTATGCAG GGTGTTCCTG GCCGAAGTAA TGCCTTTGAA ATTGCCAAAC 14820

GTCTAGGCCT ATCTGAAGTT ATCGTAGGAG ATGCCAGTCA GCAGATCGAT CAGGACAATG 14880

ACGTCAATCG TATCATTGAG CAATTAGAAG AGCAGACGCT GGAAAGCCGC AAACGTTTGG 14940

ACAATATCCG TGAGGTGGAG CAAGAAAATC TCAAGATGAA CCGTGCGCTA AAAAAACTCT 15000

ACAACGAGCT TAATCGTGAA AAGGAAACCG AGCTTAACAA GGCGCGTGAA CAGGCTGCTG 15060

AGATTGTGGA TATGGCCCTA AGTGAAAGTG ACCAGATTCT CAAAAATCTC CACAGTAAAT 15120

CCCAACTCAA GCCCCACGAA ATCATTGAAG CCAAGGCCAA GTTGAAAAAA TTGGCTCCTG 15180

AAAAAGTGGA CTTGTCTAAA AATAAGGTCC TTCAAAAGGC CAAGAAAAAA CGAGCTCGAA 15240

AGGTGGGAGA TGATATCGTG GTTCTCAGTT ATGGTCAGCG TGGTACCTTG ACCAGTCAAC 15300

TCAAGGACGG TCGCTGGGAA GCCCAAGTTG GCTTGATTAA GATGACCTTG GAAGAGAAAG 15360

AGTTTGATCT TGTTCAAGCC CAGCAAGAAA AACCAGTCAA GAAGAAACAG GTCAATGTTG 15420

TGAAACGAAC TTCTGGGCGA GGACCTCAAG CTAGACTGGA TCTTCGAGGC AAGCGCTATG 15480

AAGAAGCCAT GAATGAGCTA GATACCTTCA TCGACCAAGC CTTGCTTAAC AATATGGCTC 15540

AAGTTGATAT CATCCATGGT ATCGGAACAG GAGTCATCCG TGAAGGAGTT ACCAAATACT 15600

TGCAAAGAAA CAAACATGTC AAGAGTTTCG GCTATGCCCC ACAAAATGCT GGAGGCAGTG 15660

GTGCGACTAT TGTCACTTTT AAAGGATAGC AGTATTCTGG ACTTTATAAA GTAAAAACTG 15720

TTGAACTAAT TTTTACTAAT AAACACATTG ACAAAAGCCA ACATTTTTTG TAAAATTAGA 15780 

ATCAATTAAA TACCAACACC GAATGAAGTT TAATAGAAGT GGGGAATCGT TTGATTTTCC 15840 

ATGACTGTAA ATGGACGGAA CTCTGGAGAG ACCGTAAAGG CACCGAAGGG CAAGGCAGGC 15900 

AACTGCTCAA ACTCTCAGGT AAAAGGACAG AGCTAGGATA GACCGCTTTT TAGCATTTAT 15960 

CTAAGCATTC CAGAGTACAT GTATCTTGCA TGTGCTCTTT CTTTTGGGGT TGAAACGATA 16020 

GGAGAAGGAA ATGTTAGAAT TGCTTAAATC AATCGATGCT TTTGCTTGGG GACCGCCCCT 16080 

CTTGATTTTA TTGGTCGGAA CAGGGATTTA CCTAACTATT CGGCTAGGAC TCTTGCAGGT 16140 

TTTGCGTCTA CCCAAGGCCT TTCAGCTTAT TTTTATCCAG GATAAGGGAC ATGGTGATGT 16200 

ATCCAGTTTT GCAGCTCTGT GTACAGCCTT GGCATCAACT GTTGGAACAG GAAATATCAT 16260 

AGGAGTTGCG ACGGCTATCA AGGTTGGTGG ACCAGGAGCT CTATTTTGGA TGTGGATGGC 16320 

GGCTTTCTTT GGAATGGCTA CCAAGTATGC GGAAGGACTC TTGGCCATCA AATACCGCAC 16380 

CAAGGACGAC CATGGTGCAG TAGCGGGAGG TCCCATGCAT TATATCCTTC TAGGGATGGG 16440 

AGAAAAGTGG CGACCACTTG CTGTTTTGTT TGCAGTAGCA GGAGTATTGG TTGCTCTCTT 16500 

GGGAATCGGA ACCTTCACCC AAGTCAACTC GATTGCAGAA TCTATCCAAA ATACAACGAC 16560 

GATTTCGCCA GCCATCACAG CTCTCGTCTT GTCTGTGTTT GTAGCGATTG CAGTCTTTGG 16620 

TGGACTCAAG TCTATTTCTA AGGTTTCAAC TACTGTTGTT CCTTTTATGG CCATCATTTA 16680 

TATCTTAGGA ACTCTTACAG TTATTTTCTT TAATATCGGA AAAATCCCTG GCACAATCGC 16740 

TTTAGTCTTT ACCTCAGCTT TTAGTCCCCT TGCTGCGGTA GGTGGATTTG CTGGTGCTAG 16800 

CGTTCGGATG GCTATTCAAA ATGGTGTGGC GCGTGGTGTG TTCTCAAACG AATCTGGTCT 16860 

GGGTTCTGCT CCTATTGCAG CTGCAGCTGC CAAGACAAAT GAACCAGTAG AGCAAGGTTT 16920 

GATTTCCATG ACAGGAACCT TTATTGATAC CCTCATCATT TGTACTCTAA CTGGTTTGAC 16980 

CATCTTGGTA ACTGG 16995 
(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28473 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 
CCGGGGCTTT TGTAGTATAA TAGAGATACG TTTTGAAAGT AGGAGGTATC TATGGACTTA 60

ACTAAGCGCT TTAATAAACA GTTAGATAAA ATTCAAGTTT CGTTGATTCG TCAGTTTGAC 120

CAGGCTATTT CGGAGATTCC TGGGGTCTTG CGTTTGACCT TGGGGGAACC TGATTTTACA 180

ACGCCAGACC ATGTCAAGGA GGCGGGCAAG CGAGCGATTG ATCAGAACCA ATCCTACTAT 240

ACAGGGATGA GTGGTCTGCT GACTCTACGT CAGGCAGCCA GTGACTTTGT TAAGGAAAAG 300

TACCAACTGG ACTATGCTCC TGAAAATGAA ATCTTGGTTA CAATTGGGGC GACAGAGGCT 360

TTATCTGCGA CTTTGACGGC TATTTTGGAA GAGGGAGACA AGGTACTTTT GCCAGCTCCT 420

GCTTATCCAG GCTATGAACC GATTGTTAAC TTAGTTGGGG CAGAAATTGT TGAGATTGAT 480

ACGACTGAAA ATGGTTTTGT CTTGACTCCT GAGATGTTGG AGAAGGCCAT TTTGGAGCAG 540

GGTGATAAGC TCAAGGCGGT TATTCTCAAC TATCCAGCCA ATCCGACAGG AATTACCTAC 600

AGTCGAGAGC AGTTAGAGGC CTTGGCAGCT GTTTTACGCA AGTACGAAAT TTTTGTTGTC 660

TGTGATGAGG TTTACTCAGA ATTGACCTAC ACAGGCGAAG CCATGTGTCT CTAGGAACGA 720

TGTTGAGAGA CCAGGCTATT ATTATCAATG GTTTGTCTAA ATCGCATGCC ATGACAGGTT 780

GGCGTTTGGG GCTGATTTTC GCTCCTGCGA CCTTCACAGC CCAGTTAATC AAGAGTCACC 840

AGTACTTGGT CACTGCCGCA AATACCATGG CGCAACATGC TGCGGTAGAA GCCTTGACGG 900

CTGGTAAAAA CGATGCGGAC CCATGAAGAA GGAATATATC CAACGTCGGG ACTATATCAT 960

CGAAAAAATG ACTGCTCTTG GTTTTGAGAT TATCAAACCA GACGGTGCCT TCTATATTTT 1020

TGCTAAAATT CCAGCGGGCT ACAATCAAGA CTCCTTTGCT TTTCTGAAGG ATTTTGCTCA 1080

GAAGAAGGCC GTTGCCTTTA TCCCTGGTGC AGCCTTTGGA CGTTACGGGG AAGGCTACGT 1140

CCGCCTATCT TATGCAGCCA GCATGGAGAC TATCAAAGAA GCCATGAAAC GACTTGAGGA 1200

GTACATGAGA GAAGCATGAT TCAGTCTATC ACGAGTCAAG GCTTGGTGCT TTACAATCGC 1260

AATTTTCGTG AGGATGACAA GCTCGTCAAA ATTTTTACAG AGCAGGTTGG CAAACGCATG 1320

TTTTTTGTCA AACACGCTGG TCAGTCTAAG CTGGCGCCTG TTATTCAGCC CTTGGTGCTG 1380

GCACGATTTC TCTTGCGAAT CAATGATGAC GGACTCAGTT ACATCGAAGA CTATCATGAG 1440

GTCATGACTT TTCCCAAGAT TAATAGTGAC CTCTTTGTCA TGGCCTATGC GACCTATGTG 1500

GCAGCTCTTG CAGATGCTAG TTTGCAGGAC AATCAGCAGG ATGCTCCCTT GTTTGCTTTT 1560

TTGCAAAAGA CTTTGGAGTT GATGGAAGCA GGCTTGGATT ATCAGGTTTT GACCAATATT 1620

TTTGAAATTC AAATTTTGAC TCGATTTGGA ATCAGCCTCA ATTTTAATGA GTGTGTCTTC 1680

TGCCATCGGG TTGGTCAGGC TTTTGACTTT TCTTTCAAAT ATGGAGCCTG CCTCTGTCCA 1740

GAGCATTATC ATGAGGATAA GAGACGTTGT CATCTCAATC CCAATATCCC CTATCTGCTC 1800

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AATCAATTTC AAGCTATTGA TTTTGAGACT TTGGAGACCA TTTCGCTCAA GCCTGGAATC I860 

AAGCAAGAGC TACGCCAATT TATGGATCAA TTATATGAAG AGTACGTTGG GATTCACCTA 1920 

AAATCAAAGA AATTTATTGA TTCCCTAGCA GACTGGGGAC AATTACTAAA AGAGGAAAAG 1980 

AAATGAAAAA AATCGCAGTA GATGCCATGG GGGGCGATTA CGCACCTCAG GCCATTGTTG 2040 

AGGGTGTCAA TCAAGCCCTA TCTGACTTTT CAGATATCGA GGTTCAACTT TACGGAGATG 2100 

AAGCTAAAAT CAAGCAATAT CTGACAGCGA CAGAGCGCGT CAGCATTATC CATACGGATG 2160 

AGAAGATTGA TTCGGATGAT GAACCTACGA GAGCTATTCG GAATAAGAAA AATGCCAGTA 2220 

TGGTATTGGC AGCCAAGGCT GTCAAAGATG GTGAAGCAGA CGCTGTCCTT TCGGCTGGGA 2280 

ATACAGGTGC CTTGTTGGCA GCAGGATTCT TCATCGTGGG TCGTATCAAG AATATCGACC 2340 

GTCCTGGACT CATGTCTACC TTGCCTACCG TTGATGGAAA AGGTTTTGAC ATGCTAGACC 2400 

TTGGTGCCAA TGCAGAAAAT ACAGCCCAGC ACCTCCATCA ATATGCGGTT CTAGGTTCCT 2460 

TCTATGCTAA AAATGTCCGT GGCATTGCGC AACCACGCGT TGGTTTGCTC AACAACGGAA 2520 

CAGAGAGTAG CAAGGGCGAC CCGCTTCGTA AGGAAACTTA TGAATTACTG GCGGCTGATG 2580 

AAAGTTTGAA CTTTATCGGA AACGTGGAAG CGCGTGATTT GATGAATGGC GTTGCAGATG 2640 

TTGTTGTGGC AGATGGTTTC ACGGGAAACG CTGTGCTCAA ATCCATCGAA GGGACAGCTA 2700 

TGGGAATCAT GGGCTTGCTC AAGACAGCTA TTACAGGTGG TGGTCTTCGA GCGAAACTAG 2760 

GTGCCCTCCT TCTCAAGGAC AGCCTCAGTG GTTTGAAAAA ACAGCTCAAT TATTCAGATG 2820 

TTGGTGGAGC GGTCTTGTTT GGTGTTAAGG CACCTGTTGT CAAGACTCAT GGCTCAAGCG 2880 

ATGCCAAGGC TGTTTATAGT ACGATTCGTC AGATCCGTAC CATGCTAGAA ACAGACGTGG 2940 

TTGCCCAGAC TGCGCGTGAA TTTTCAGGAG AATAAAAGAG ATGACAGAAA AAGAAATTTT 3000 

TGACCGTATT GTGACCATTA TCCAAGAGCG ACAGGGAGAG GACTTTGTCG TGACAGAATC 3060 

CTTGAGTCTG AAAGACGATT TGGATGCGGA TTCTGTTGAC TTGATGGAGT TTATCTTGAC 3120 

TCTGGAAGAT GAATTTAGTA TCGAAATCAG CGATGAAGAA ATTGACCAAC TCCAAAACG? 3180 

AGGAGATGTG GTTAAAATCA TTCAAGGAAA ATAGCAATCG GAGTTCCAAG TCAACGGAAG 3240 

TAGATGGTTT TTAGAAATGA GAAATATCGG ACAAGCTGGT AAAATCTTGG CTGACAGTGG 3300 

TTATCAAGGG CTCATGAAGA TATATCCTCA AGCACAAACT CCACGTAAAT CCAGCAAACT 3360 

CAAGCCGCTA ACAGTTGAAG ATAAAGCCTG TAATCATGCG CTATCTAAGG AGATAAGCAA 3420 

GGTTGAGAAT ATCTTTGCCA AAGTAAAAAC GTTTAAAATG TTTTCAACAA CCTATCGAAA 3480 

TCATCGTAAA CGCTTCGGAT TACGAATGAA TTTGATTGCT GGTATTATCA ATCATGAACT 3540 

AGGATTCTAG TTTTGCAGGA AGTCTAATAG TAAAAAAGTG ATTAGAAAAC ATCTTTTTTA 3600

AAAATAGAGA TGATTTTGAA ACAAAAAAGC TAATTCAAGA CGTTTCGATG CCAATTCAAG 3660

ATTTGGATGA AAAAAATTAA TAGATACTGT TATACTAAAC TTGTCAAGTT TGTAACAAGA 3720

CAAATATTAA AAATAAAAAA GAGGTATTCG TTATGAATAC AAAAACGATG TCACAATTTG 3780

AAATTATGGA TACTGAGATG CTTGCTTGCG TTGAAGGTGG CGGATGCAAT TGGGGAGATT 3840

TTGCCAAAGC AGGTGTTGGA GGAGGAGCAG CACGAGGTCT TCAGCTAGGA ATTAAAACAA 3900

GAACATGGCA AGGTGCAGCA ACTGGTGCTG TGGGAGGAGC TATACTTGGA GGTGTGGCCT 3960

ATGCAGCGAC ATGTTGGTGG TAATTATGGA TTTTAAAAGT TTTATTATTG GTTTAGTAGT 4020

TGGTATATTT GGTCCTTATA TGGATGATTT AATTAGAAAA AAATTTTTAA AGTCTTCGGA 4080

GAAGAAAACA GAAAAATCTG TTAAAAAATA ATCAAAACTA TAAATGATGA ATCTGAATCA 4140

AAATTATTTT GCGCATGTAA AGAGGAGTCT TATAGTAACG AGTCAAAAAA GGAGTAACTA 4200

TGAATCGTAA TTTAGAACGG TGTTATCTAT TCTGACTAGG AATAGATCAT ACCAGAGGTA 4260

GCTTAGAAAT AGCAGAGACA TTAGAAATTG AAGTAATAAA TAGGATGTCG TAAGTGTTAC 4320

TATCAATGAT TTATTTGTTT CAAGCTTGCC TAGGGTGACA GTAAAAAATG AATTTCCTTT 4380

CAATAGCATA TTTTTAGTGG GCAGGACTCT TGTTCTGCCT            CCAAAAAGTG 4440

CAGTTGGGAG GGAGATAGGC TCATTTGGGA AGGAAGTCCA GTTTTTGTTT AGTGATTGGG 4500

GTAAGATAGT TGTTATCAGA TGAGTTAATA CTCTTCGAAA ATCAAATTCA AACCACGTCA 4560

ACGTCGCCTT GCOGTATATA TGTGACTGAC TTCGTCAGTC CTATCTACAA CCTCAAAACA 4620

GTGTTTTGAG CAGCCTACGG CTAGTTTCCT AGTTTGCTCT TTGATTTTCA TTGAGTATTA 4680

GGGAAAAGGA GATGAATATG AAATTTGGGA AACGTCATTA TCGTCCGCAG GTGGATCAGA 4740

TGGACTGCGG TGTAGCTTCA TTAGCCATGG TTTTTGGCTA CTATGGTAGT TATTATTTTT 4800

TGGCTCACTT GCGAGAATTG GCTAAGACGA CCATGGATGG GACGACGGCT TTGGGCTTGG 4860

TCAAGGTGGC AGAGGAGATT GGTTTTGAGA CGCGAGCCAT TAAGGCAGAT ATGAGGCTTT 4920

TTGACTTGCC GGATTTAACT TTTCCTTTTG TTGCCCATGT GCTTAAGGAA GGGAAATTGG 4980

TCCACTACTA TGTGGTGACT GGGCAGGATA AGGATAGCAT TCATATTGCC GATCCAGATC 5040

CCGGGGTGAA GTTGACTAAA CTGCCACGTG AGCGTTTTGA GGAAGAATGG ACAGGAGTGA 5100

CTCTTTTTAT GGCACCTAGT CCAGACTATA AGCCTCATAA GGAACAAAAA AATGGTCTGC 5160

TCTCTTTTAT CCCTATATTA GTGAAGCAGC GTGGCTTGAT TGCCAATATC GTTTTGGCAA 5220

CACTCTTGGT AACCGTGATT AACATTGTGG GTTCTTATTA TCTGCAGTCT ATCATTGATA 5280

CCTATGTGCC AGATCAGATG CGTTCGACAC TAGGGATTAT TTCTATTGGG CTAGTCATCG 5340

TCTACATCTT CCAGCAAATC TTGTCTTACG CTCAGGAGTA TCTCTTGCTT GTTTTGGGGC 5400

AACGCTTGTC GATTGACGTG ATTTTGTCCT ATATCAAGCA TGTTTTTCAC CTCCCTATGT 5460

CCTTCTTTGC GACACGCAGG ACAGGGGAGA TCGTGTCTCG TTTTACAGAT GCTAACAGTA 5520

TCATCGATGC GCTGGCTTCG ACCATCCTTT CGATTTTCCT AGATGTGTCA ACGGTTGTCA 5580

TTATTTCCCT TGTTCTATTT TCACAAAATA CCAATCTCTT TTTCATGACT TTATTGGCGC 5640

TTCCTATCTA CACAGTGATT ATCTTTGCCT TTATGAAGCC GTTTGAAAAG ATGAATCGGG 5700 

ATACCATGGA AGCCAATGCG GTTCTGTCTT CTTCTATCAT TGAGGACATC AACGGTATTG 5760

AGACTATCAA GTCCTTGACC AGTGAAAGTC AGCGTTACCA AAAAATTGAC AAGGAATTTG 5820 

TGGATTATCT GAAGAAATCC TTTACCTATA GTCGAGCAGA GAGTCAGCAA AAGGCTCTGA 5880 

AAAAGGTTGC CCATCTCTTG CTTAATGTCG GCATTCTCTG GATGGGGGCT GTTCTGGTCA 5940 

TGGATGGCAA GATGAGTTTG GGGCAGTTGA TTACCTATAA TACCTTGCTG GTTTACTTTA 6000 

CTAATCCTTT GGAAAATATC ATCAATCTGC AAACCAAGCT TCAGACAGCG CAGGTTGCCA 6060 

ATAACCGTCT AAATGAAGTG TATCTAGTAG CTTCTGAGTT TGAGGAGAAG AAAACAGTTG 6120 

AGGATTTGAG CTTGATGAAG GGAGATATGA CCTTCAAGCA GGTTCATTAC AAGTATGGCT 6180 

ATGGTCGAGA TGTCTTATCG GATATCAATT TAACCGTTCC CCAAGGGTCT AAGGTGGCTT 6240 

TTGTGGGGAT TTCAGGGTCA GGTAAGACGA CTTTGGCCAA GATGATGGTT AATTTTTACG 6300 

ACCCAAGTCA AGGGGAGATT AGTCTGGGTA GTGTCAATCT CAATCAGATT GATAAAAAAG 6360 

CCCTGCGCCA GTACATCAAC TATCTGTCTC AACAGCCCTA TGTCTTTAAC GGAACGATTT 6420 

TGGAGAATCT TCTTTTGGGA GCCAAGGAGG GGACGACACA GGAAGATATC TTACGGGCGG 6480 

TCGAATTGGC AGAGATTCGA GAGGATATCG AGCGCATGCC ACTGAATTAC CAGACAGAAT 6540 

TGACTTCGGA TGGGGCAGGG ATTTCAGGTG GTCAACGTCA GAGAATCGCT TTGGCGCGTG 6600 

CTCTCTTGAC AGATGCGCCG GTCTTGATTT TGGATGAGGC GACTAGCAGT TTGGATATTT 6660 

TGACAGAGAA GCGGATTGTC GATAATCTCA TTGCTTTGGA CAAGACCTTG ATTTTCATTG 6720 

CTCACCGCTT GACTATTGCT GAGCGGACAG AGAAGGTAGT TGTCTTGGAT CAGGGCAAGA 6780 

TTGTCGAAGA AGGAAAGCAT GCTGATTTGC TTGCACAGGG TGGCTTTTAC GCCCATTTGG 6840 

TCAATAGCTA GAAAGAGGAG AGGATGAAAC CAGAATTTTT AGAAAGTGCG GAGTTTTATA 6900 

ATCGTCGTTA CCATAATTTT TCCAGTAGTG TGATTGTACC CATGGCCCTT CTGCTTGTGT 6960 

TTTTACTTGG CTTTGCAACT GTTGCAGAGA AGGAGATGAG TTTGTCCACT AGAGCTACTG 7020 

TCGAACCTAG TCGTATCCTT GCAAATATCC AGTCAACTAG CAACAATCGT ATTCTTGTCA 7080 

ATCATTTGGA AGAAAATAAG CTGGTTAAGA AGGGGGATCT TTTGGTTCAA TACCAAGAAG 7140

GGGCAGAGGG TGTCCAAGCG GAGTCCTATG CCAGTCAGTT GGACATGCTA AAGGATCAAA 7200

AAAAGCAATT GGAGTATCTG CAAAAGAGCC TGCAAGAAGG GGAGAACCAC TTTCCAGAGG 7260

AGGATAAGTT TGGCTACCAA GCCACCTTTC GCGACTACAT CAGTCAAGCA GGCAGTCTTA 7320

GGGCTAGTAC ATCGCAACAA AATGAGACCA TCGCGTCCCA GAATGCAGCA GCTAGCCAAA 7380

CCCAAGCCGA AATCGGCAAC CTCATCAGTC AAACAGAGGC TAAAATTCGC GATTACCAGA 7440

CAGCTAAGTC AGCTATTGAA ACAGGTGCTT CCTTGGCCGG TCAGAATCTA GCCTACTCTC 7500

TTTACCAGTC CTACAAGTCT CAGGGCGAGG AAAATCCCCA AACTAAGGTT CAGGCAGTTG 7560

CACAGGTTGA AGCACAGATT TCTCAGTTAG AATCTAGTCT TGCTACTTAC CGTGTCCAGT 7620

ATGCAGGTTC AGGTACCCAG CAAGCCTATG CGTCAGGGTT AAGCAGTCAA TTGGAATCCC 7680

TTAAATCCCA ACACTTGGCA AAGGTTGGTC AGGAATTGAC CCTTCTAGCC CAGAAAATTT 7740

TGGAGGCAGA GTCAGGTAAG AAGGTACAGG GAAATCTTTT AGACAAGGGG AAAGTTACGG 7800

CGAGTGAGGA TGGGGTGCTT CATCTTAATC CTGAGACCAG TGATTCTAGC ATGGTTGCAG 7860

AAGGTGCCCT ACTAGCCCAA CTTTATCCAT CTTTGGAAAG AGAAGGGAAA GCCAAACTCA 7920

CAGCTTATCT AAGTTCAAAA TATGTAGCAA GAATCAAGGT CGGTGATTCT GTTCGCTATA 7980

CTACGACTCA TGATGCCGGG AATCAACTTT TCCTAGATTC TACTATTACA AGTATTGATG 8040

CGACAGCTAC TAAGACTGAG AAAGGGAATT TCTTTAAAAT CGAGGCGGAG ACTAATCTAA 8100

CTTCGGAGCA GGCTGAAAAA CTTAGGTACG GGGTGGAAGG CCGCTTGCAG ATGATTACGG 8160

GCAAGAAAAG TTACCTACGT TATTATTTGG ATCAATTTTT GAACAAAGAG TAATGTTCGT 8220

GTTTTTAGAG TTAAATAATT TTTAAACTGT GAGAAAGATT CTTCTTGCAG TTTTTTCTTT 8280

ACAATTTTTG AAAAACATCT ACTATTTATT CGGTTAAATT CTTGTGTTTT TTGGTTTTTT 8340

GTGGTAAAAT GTGCTCAAGT AATACGAAAG GCGAACTTTA AAATGTCAAA ACAATTGATC 8400

TATTCGGGAA AAGCTAAAGA TATCTATACA ACTGAGGATG AAAATCTTAT TATTTCAACT 8460

TACAAGGACC AGGCGACTGC TTTCAACGGT GTCAAGAAGG AGCAGATTGC AGGTAAGGGA 8520

GTCTTGAATA ATCAGATCTC ATCTTTTATT TTTGAGAAAT TAAATGTGGC TGGTGTGGCG 8580

ACTCACTTTG TGGAGAAACT TTCAGACACG GAACAACTCA ATAAAAAGGT TAAGATTATT 8640

CCTTTGGAAG TCGTGCTCCG CAACTATACT GCTGGTTCCT TTTCAAAACG TTTTGGTGTG 8700

GATGAGGGAA TCGCCTTGGA GACTCCGATT GTCGAATTTT ACTACAAAAA TGATGATTTG 8760

GATGATCCAT TTATCAATGA TGAGCATGTG AAATTCCTAC AGATTGCGGG TGACCAGCAG 8820

ATTGCCTACT TGAAGGAAGA AACGCGTCGT ATCAATGAAC TATTGAAAGT CTGGTTTGCT 8880

GAGATTGGGC TTAAATTGAT TGACTTTAAG CTAGAGTTCG GTTTTGACAA GGATGGCAAG 8940

ATTATCTTGG CAGACGAATT TTCACCAGAT AACTGCCGCT TGTGGGACGC TGATGGCAAC 9000 

CACATGGATA AGGATGTTTT CCGTAGAGGA TTGGGAGAAC TAACCGACGT TTATGAGATT 9060 

GTTTGGGAAA AGTTGCAGGA ATTGAAATAA TCTGTTTGCA ACGGAAAACC TTCGTCTCTC 9120 

AACTAAAAGG ACTCAGGCTG AAAAGGTCCC CCAGACCTTT TCACTCTGTA GAGAACTAGG 9180 

TGAACTAACA GATGTTTACG AAATTGTCTG GGAAAAGTTG CAGGGTTTAA AATAACAACC 9240 

TCAAGGCTGT TTGGGAATAT TGCAAGAGCT GAAATAAAGG AATAAGAATT GATGGATAAA 9300 

CGTATTTTTG TTGAAAAAAA GGCTGATTTT CAGGTCAAGT CAGAGAGTTT GGTTAGAGAG 9360 

CTCCAGCACA ACTTGGGACT GTCAAGCTTG AAAAGTATTC GTATTGTGCA AGTATATGAT 9420 

GTATTTGACT TGGCTGAGGA CTTGTTTGCA CCTGCAGAGA AGCACATTTT CTCTGAGCAG 9480 

GTAACCGACC ATGTTTTAGA TGAAGTATCT GTGCAGGCGG ATCTTGCTAA CTATGCTTTC 9540 

TTTGCCATTG AAAGTCTGCC AGGGCAGTTT GACCAGCGTG CAGCTTCGTC ACAGGAAGCC 9600 

TTGCTTTTGT TGGGAAGTTC GAGTGACGTG ACAGTCAACA CAGCCCAACT TTACTTGGTG 9660 

AATAAAGATA TTGATGCGAC TGAGTTGGAA GCTGTCAAAA ACTACCTGCT CAATCCAGTT 9720 

GATTCTCGTT TCAAGGATAT CACGACAGGG ATTGCCAAGC AGGAGTTTTC AGAGTCAGAC 9780

AAGACCATTC CCAAATTGAC TTTCTTTGAA AGCTATGCAG CAGAAGACTT TGCTCGCTAC 9840 

AAGGCCGAAC AAGGGATGGC CATGGAAGTG GATGATTTGC TCTTTATCCA AGACTACTTT 9900 

AAGTCAATCG GGCGCX3TGCC AACTGAGACT GAACTCAAGG TTTTGGACAC TTACTGGTCT 9960 

GACCACTGCC GTCATACGAC TTTTGAGACA GAGTTGAAAC ACATCGACTT TTCAGCTTCT 10020 

AAATTTCAAA AGCAATTGCA GTCAACCTAT GACAAGTATA TTGCCATGCG CGAGGAATTA 10080 

GGTCGGTCTG AAAAACCACA AACCTTGATG GATATGGCGA CTATTTTCGG TCGTTATGAG 10140 

CGTGCTAATG GACGATTGGA TGATATGGAA GTCTCTGACG AAATCAATGC CTGCTCAGTT 10200 

GAAATTGAAG TGGACGTTGA TGGTGTCAAG GAACCTTGGC TCCTCATGTT TAAAAACGAA 10260 

ACCCACAACC ATCCAACAGA AATTGAGCCA TTTGGTGGAG CGGCTACCTG TATTGGTGGA 10320 

GCTATTCGTG ATCCGTTGTC AGGCCGTTCC TATGTTTACC AAGCCATGCG TATTTCAGGT 10380 

GCTGGTGATA TTACAGCACC GATTTCGGAA ACTCGCGCTG GGAAATTGCC ACAACAAGTC 10440 

ATTTCTAAAA CAGCAGCTCA TGGTTATTCT TCATATGGTA ACCAGATTGG GCTTGCAACA 10500 

ACCTACGTTC GTGAATACTT CCACCCAGGC TTTGTAGCTA AACGTATGGA ACTTGGTGCC 10560 

GTTGTTGGTG CGACTCCCAA GGGCAATGTT GTCCGTGAAA AACCTGAAGC AGGTGATGTG 10620 

ATCATCCTTC TCGGAGGCAA AACAGGTCGT GATGGTGTCG GTGGTGCGAC GGGCTCTTCT 10680

AAGGTTCAAA CAGTTGAGTC TGTAGAGACT GCTGGTGCTG AGGTTCAAAA AGGAAATGCC 10740

ATCGAAGAAC GCAAGATTCA GCGCCTCTTC CGTAATGGCA ATGTCACTCG TCTGATCAAG 10800 

AAGTCCAATG ACTTTGGGGC AGGCGGTGTC TGTGTGGCTA TCGGTGAATT GGCAGACGGT 10860 

CTTGAAATCG ACCTCAACAA GGTGCCTCTT AAATACCAGG GCTTGAATGG TACAGAAATT 10920 

GCCATCTCTG AATCACAAGA ACGGATGGCG GTCGTGGTTC GTCCTGAAGA TGTGGATGCC 10980 

TTCGTTGCCG AATGTAACAA AGAAAATATT GATGCTGTTG TGGTGGCGAC AGTAACTGAA 11040 

AAACCAAATC TTGTCATGCA CTGGAATGGT GAGACAATCG TTGACTTGGA GCGTCGTTTC 11100 

CTTGACACCA ATGGTGTGCG CGTGGTTGTC GATGCCAAAG TTGTGGACAA GGATGTCAAA 11160 

CTCCCAGAAG AGCGTCAAAC ATCTGCTGAA ACACTGGAAT CAGATACCCT TACGGTTCTA 11220 

TCTGACCTCA ACCATGCAAG TCAAAAAGGA TTACAGACTA TCTTTGACTG CTCTGTTGGA 11280 

CGCTCAACGG TTAATCACCC ACTTGGTGGT CGTTACCAAC TCACACCAAC TGAGGCATCT 11340 

GTGCAGAAAT TGCCAGTTCA ACACGGTGTG ACTCATACTG CGTCGGTCAT TGCTCAAGGT 11400 

TTCAACCCAT ATGTAGCTGA ATGGTCTCCA TACCACGGTG CTGCTTATGC GGTTATCGAA 11460 

GCAACTGCTC GTTTGGTGGC TGCTGGTGCC AACTGGTTCA AGGCTCGTTT CTCTTACCAA 11520 

GAGTATTTCG AGCGTATGGA TAAACAAGCA GAGCGTTTCG GTCAGCCAGT AGCTGCTCTT 11580 

CTAGGTTCTA TTGAAGCACA AATTCAGCTT GGCTTGCCAT CTATCGGTGG TAAGGACTCC 11640 

ATGTCTGGTA CCTTTGAAGA ATTGACCGTT CCGCCAACCT TGGCTGCCTT TGGGGTGACG 11700 

ACGGCAGATA GCCGTAAGGT GCTCTCTCCA GAATTTAAAG CTGTTGGGGA AAATATCTAC 11760 

TACATCCCAG GTCAAGCCCT CTCTGCAGAG ATTGATTTTG ACTTGATTAA GAAAAATTTT 11820 

GCTCAGTTTG AAGCCATCCA AGCTGACCAT AAAGTGACAT CTGCATCAGC TGTCAAATAG 11880 

GGTGGTGTAG TTGAAAGTTT GGCTCTTGCT ACCTTTGGAA ACTATATTGG TGCAGAGGTG 11940 

ACCTTGCCTG AACTTGAAAC AGCTTTGACA GCTCAATTAG GCGGCTTTGT CTTCACATCT 12000 

CCTGAAGAAA TTGCTGGAGT AGAGAAGGTT GGACAAACGA AAGCAGACTT TACACTGACT 12060 

GTCAACGGTG TGAAGCTAGA TGGACACAAG CTTGACAGTG CATTTCAAGG GACATTGGAA 12120 

GAAGTTTACC CAACAGAATT TACCCAAGCG AAAGAACTAG AAGAAGTACC AGCTGTGGCA 12180 

TCAGATGTTG TGATTAAAGC CAAAGAAAAG GTTGAAAAAC CTGTGGTTTA CATCCCAGTC 12240 

TTTCCAGGAA CCAACTCAGA ATATGATTCA GCTAAGGCCT TCGAAAAAGA AGGTGCAGAG 12300 

GTCAATTTGG TGCCATTCGT GACCTTGAAT GAAGAAGCTA TTGTCAAGTC AGTTGAAACT 12360 

ATGGTTGACA ATATOGACAA GACTAATATT CTCTTCTTTG CTGGTGGATT CTCGGCTGCG 12420

GATGAACCAG ATGGTTCAGC TAAGTTTATC GTCAATATCC TGCTTAATGA AAAAGTGCGT 12480

GTGGCTATTG ATAGCTTTAT CGCCCGTGGT GGTTTGATTA TCGGTATTTG TAATGGATTC 12540

CAAGCCTTAG TCAAATCGGG TCTCCTACCC TACGGAAACT TTGAAGCTGC TAACAGTACT 12600

AGCCCAACCC TCTTCTACAA TGATGCCAAC CAACACGTGG CCAAGATGGT GGAAACTCGC 12660

ATTGCCAATA CCAACTCACC ATGGTTGGTT GGTGTGCAAG TGGGCGATAT CCACGCTATT 12720

CCTGTTTCGC ACGGTGAAGG GAAGTTTGTC GTGACGGCTG AGGAATTTGC AGAGCTCCGT 12780

GACAATGGAC AAATTTTCAG CCAATACGTT GACTTTAACG GTAAACCAAG TATGGATTCT 12840

AAGTACAATC CGAATGGTTC TGTCCATGCC ATCGAAGGAA TTACCAGCAA GAATGGTCAA 12900

ATCATCGGTA AGATGGGCCA CTCAGAACGT TATGAGGATG GTCTTTTCCA AAATATCCCA 12960

GGCAATAAAG ACCAACACCT GTTCGCATCA GCGGTTAAAC ATTTCACTGG AAAATAAGAC 13020

TTACAGATTT TCTAATAGAT AGTATCAGTA ATGTAAAAGT CATGTAAATC TAGCTCTTGA 13080

TGATTACAAA TGAAAATTAG GTATAAAAAA TGACATACGA AGTAAAATCT CTTAATGAAG 13140

AATGTGGTGT TTTCGGTATT TGGGGACATC CAGATGCTGC TAAGTTGACC TATTTTGGAC 13200

TCCACAGTCT TCAACACCGT GGTCAGGAGG GGGCAGGAAT CCTCTCCAAT GATCAAGGAC 13260

AACTGAAGCG CCATCGTGAC ATGGGGCTTT TATCAGAAGT TTTCAGAAAT CCAGCTAATT 13320

TGGATAAATT GACAGGAGCT GGTGCGATTG GGCATGTGCG TTATGCGACT GCTGGCGAAG 13380

CTTCTGTAGA TAACATCCAG CCCTTCCTCT TCCGTTTTCA CGATATGCAG TTTGGTTTGG 13440

CTCATAATGG AAATCTGACC AATGCAGCCT CTCTCAAGAA AGAACTGGAA CAAAGAGGAG 13500

CAATTTTCAG CGCGACTTCG GACTCTGAAA TCTTGGCTCA CCTCATTCGT CGCAGTCATA 13560

ATCCTAGCCT GATGGGCAAA ATCAAGGAAG CGCTCAGCCT TGTCAAAGGT GGTTTTGCCT 13620

ATATCTTGCT GTTTGAGGAC AAGTTGATTG CGGCTCTTGA CCCAAATGGA TTCCGACCGC 13680

TTTCGATTGG TAAAATGGCT AATGGAGCAG TTGTTGTATC TTCTGAAACC TGTGCTTTTG 13740

AGGTCATTGG TGCCGAGTGG ATTCGTGATT TGAAGCCAGG TGAGATTGTG ATCATTGATG 13800

ACGAGGGCAT TCAGTATGAC AGCTATACAG ATGATACCCA GTTGGCGGTT TGTTCTATGG 13860

AGTATATCTA CTTTGCTCGC CCTGATTCTA ATATCCACGG TGTCAATGTC CATACGGCAC 13920

GTAAGAGAAT GGGAGCGCAA TTGGCGCGAG AATTTAAGCA TGAGGCAGAT ATTGTAGTTG 13980

GTGTGCCGAA TTCTTCCCTA AGCGCGGCTA TGGGATTTGC GGAAGAATCA GGCTTACCAA 14040

ATGAAATGGG TCTGATCAAA AACCAATACA CCCAGCGAAC TTTTATCCAA CCGACTCAAG 14100

AATTGCGGGA GCAAGGAGTG CGGATGAAAC TGTCTGCTGT TTCGGGTGTT GTCAAAGGCA 14160

AACGTGTGGT CATGGTGGAT GATTCCATTG TACGTGGAAC AACCTCTCGT CGTATCGTTC 14220

AGCTCTTGAA AGAAGCGGGT GCGACTGAGG TTCACGTTGC CATTGGAAGT CCTGCACTAG 14280

CGTATCCATG TTTCTACGGG ATTGATATCC AGACCCGTCA GGAGCTGATT GCAGCCAATC 14340 

ATACGGTCGA AGAAACTCGC CAAATCATTG GTGCGGACAG TCTGACTTAT CTTTCAATTG 14400 

ATGGCTTGAT TGAGTCGATT GGTATCGAAA CAGATGCGCC GAACGGTGGT CTCTGTGTCG 14460 

CTTACTTTGA CGGTGACTAC CCAACGCCTC TTTATGACTA CGAAGAAGAC TATCGTAGAA 14520 

GTTTGGAAGA AAAGACCAGT TTTTACAAGT AGGCGACAGA TTCTCCATTA AAGAAAAGGA 14580 

AAAAATAAAT GACAAATAAA AATGCATATG CCTCACGTCT CACTACTGAC TAAAGGCTTA 14640 

AGCATTTAGT CAGTAGACGC TTTGTCCTAT AGGATCAAAG CTAGAGCCCT GACTAGTATT 14700

TTTAGATAAA AAGATGGTTT ATCTAAAAAT ACGTCGCAGT CTTTCTCAAA AAAAGAAAAG 14760 

GAAAAATAAA ATGGCAAATA AAAATGCGTA CGCTCAATCT GGTGTGGATG TTGAAGCGGG 14820 

TTATGAAGTT GTTGAACGGA TTAAAAAGCA CGTGGCCCGT ACGGAGCGTG CAGGTGTCAT 14880 

GGGAGCTCTT GGTGGCTTTG GTGGTATGTT TGACCTTTCC AAGACTGGGG TTAAAGAACC 14940 

CGTCTTGATT TCAGGGACTG ACGGTGTCGG AACCAAGCTC ATGTTGGCTA TCAAGTACGA 15000 

CAAGCACGAT ACCATCGGGC AGGACTGTGT GGCCATGTGT GTCAACGACA TCATTGCTGC 15060 

AGGTGCGGAA CCCCTCTATT TTCTCGACTA CGTAGCGACA GGGAAGAATG AACCAGCTAA 15120 

GCTAGAACAA GTGGTTGCTG GTGTGGCAGA AGGTTGTGTG CAGGCTGGTG CTGCCCTCAT 15180 

CGGTGGGGAA ACGGCTGAAA TGCCGGGCAT GTACGGCGAA GACGACTATG ACTTGGCTGG 15240 

TTTTGCGGTC GGTGTGGCTG AAAAATCTCA AATCATTGAC GGTTCAAAGG TGGTAGAGGG 15300 

AGATGTTCTT CTCGGACTTG CTTCAAGTGG GATTCACTCA AATGGTTACT CTTTGGTTCG 15360 

TCGTGTCTTT GCGGATTACA CAGGTGAGGA AGTCCTACCA GAATTGGAAG GCAAGAAACT 15420 

TAAGGAAGTT CTACTTGAGC CGACTCGTAT CTATGTCAAG GCTGTCTTGC CGCTCATCAA 15480 

AGAAGAGTTG GTCAACGGCA TTGCCCACAT CACAGGTGGT GGCTTTATCG AAAATGTCCG 15540 

TCGTATGTTT GCAGATGACC TAGCTGCTGA AATTGATGAA AGTAAAGTTC CAGTGCTTCC 15600 

AATTTTCAAA ACCCTTGAAA AATACGGTCA GATTAAACAC GAAGAAATGT TTGAAATCTT 15660 

CAATATGGGT GTGGGACTTA TGTTGGCGGT CAGCCCTGAA AATGTAGAGC GTGTAAAAGA 15720 

ATTGTTGGAT GAAGCAGTCT ATGAAATTGG TCGCATCGTC AAGAAAGAAA ACGAAAGTGT 15780 

CATTATCAAA TGAAAAAAAT AGCGGTTTTT GCCTCTGGTA ATGGCTCAAA TTTTCAGGTG 15840 

ATTGCCGAAG AATTTCCAGT GGAGTTTGTC TTTTCAGACC ATCGTGATGC CTATGTGCTT 15900 

GAGCGTGCAA AGCAGCTCGG CGTTCTGTCC TATGCTTTTG AACTCAAGGA GTTTGAGAGC 15960

AAGGCAGACT ACGAAGCAGC CCTTGTCGAA CTCTTGGAAG AACACCAGAT TGACTTGGTT 16020

TGCCTAGCAG GCTACATGAA AATCGTTGGA CCAACCTTAT TGTCGGCTTA TGAAGGTCGG 16080

ATTGTCAACA TTCATCCAGC CTACTTGCCA GAATTTCCAG GAGCTCATGG GATTGAGGAT 16140

GCTTGGAATG CTGGCGTGGG TCAGTCTGGT GTGACCATTC ACTGGGTGGA TTCGGGTGTG 16200

GATACAGGCC AGGTCATCAA ACAGGTTCGT GTGCCACGAC TAGCTGATGA TACCATTGAC 16260

AGATTTGAAG CTCGCATCCA TGAAGCAGAG TACAGGCTGT ATCCGGAAGT AGTGAAGGCT 16320

CTATTTACAG ATTGACTTTT TGATGATTCA TATGATATCT TTGATTTTAA ATTGGAGTCA 16380

GTGTTTGTTG AAGACGGCTT CAAACGGAGG TATTTGTAAT GTTAGAATCT AAAAAAACAA 16440

CTCGATATGT ATTTTATGTC TATCTGATGT TATTAACTTG GGGAATCTTA TTTAAGTTTG 16500

AAACAAATCC TGAATTTATA GCATTTTTCT TAGCTCCAAG GTATATCAAT TGGATTCCAT 16560

TTTCAGAACC ACTAATAGTC GATGGAAAAA TTGTTTTTGC TGAAATGTTA TTTAATCTGA 16620

TTTTCTTTAT TCCATTAGGT GTTTGTTTCC CTTTGATAAA AACTAATTTA TCTAGTTTAA 16680

GAATAGTCGG GACAGGTTTC TTGATTAGTT TATTGTTTGA GTGCTTACAG TATATTTTAG 16740

CAATAGGTAT AACAGATATA ACGGATTTGA CTTTAAATAC GCTAGGTGTC TGTGTAGGCT 16800

TACTGATTTA TCAAATTTTT ATAAGAGTGT TCAAATCACA GACTAGAAAA TGGATCAATA 16860

TCTTAGGTAT GCTTAGCCTT GGTTTTGCTT ATCTTGTTTT ACTGTTACTG CATTTACTTA 16920

GTGTTTAACT AATGATTAAA AAGGAGAATA TAATGACTAA ACGCGTCTTA ATCAGCGTCT 16980

CAGACAAAGC GGGCATTGTT GAATTTGCCC AAGAACTCAA AAAACTTGGT TGGGAGATTA 17040

TCTCAACAGG TGGAACTAAG GTTGCCCTTG ATAATGCTGG GGTGGATACC ATTGCTATCG 17100

ATGATGTGAC TGGTTTCCCA GAAATGATGG ACGGTCGTGT GAAGACCCTC CACCCAAATA 17160

TCCACGGAGG GCTTCTCGCT CGTCGTGACT TGGATAGCCA CTTGGAAGCG GCTAAGGACA 17220

ACAAGATTGA GCTCATTGAC CTTGTGGTGG TCAACCTTTA CCCATTTAAG GAAACTATCC 17280

TTAAACCAGA TGTGACTTAT GCTGATGCAG TTGAAAATAT CGATATTGGT GGGCCATCTA 17340

TGCTTCGTTC AGCAGCGAAA AATCATGCCA GTGTTACAGT TGTGGTAGAT CCTGCTGACT 17400

ACGCTGTGGT TTTGGATGAA TTGGCAGCAA ACGGCGAAAC CTCTTATGAA ACTCGCCAAC 17460

GTTTAGCAGC CAAAGTATTT CGTCACACAG CGGCTTATGA CGCCTTGATT GCAGAATACT 17520

TCACAGCTCA AGTGGGTGAA AGCAAGCCTG AAAAACTCAC TTTGACTTAT GACCTCAAGC 17580

AACCAATGCG TTACGGTGAG AATCCTCAAC AAGACGCGGA CTTTTACCAG AAAGCTTTGC 17640

CTACAGACTA CTCCATTGCT TCAGCCAAAC AGCTCAACGG GAAAGAATTG TCATTTAATA 17700

ATATCCGTGA TGCAGATGCT GCTATCCGTA TCATCCGTGA CTTCAAAGAT AGTCCAACCG 17760

TTGTGGCTCT CAAACACATG AATCCATGTG GAATTGGTCA AGCTGATGAC ATCGAGACTG 17820

CTTGGGACTA CGCTTATGAG TCTGACCCAG TGTCTATCTT TGGTGGGATT GTCGTCCTCA 17880 

ACCGTGAGGT GGATGCTGCG ACAGCTGAGA AGATGCACGG CGTTTTCCTC GAAATCATCA 17940 

TTGCACCAAG CTATACGGAT GAAGCGCTAG CCATTTTGAT CAATAAAAAG AAAAACTTGC 18000 

GTATCCTTGC CTTGCCATTT AATGCTCAAG AGGCTAGCGA AGTGGAAGCA GAATACACAG 18060 

GTGTAGTCGG TGGACTTCTC GTGCAAAATC AAGACGTGGT CAAGGAAAGC CCAGCTGACT 18120 

GGCAAGTGGT GACTAAACGT CAGCCAACTG AGACAGAAGC GACTGCTCTT GAGTTCGCTT 18180 

GGAAGGCTAT CAAGTACGTC AAATCAAATG GTATTATCGT GACCAACGAC CACATGACAC 18240 

TTGGTGTTGG TCCAGGTCAA ACCAACCGTG TGGCTTCTGT TCGCCTTGCG ATTGACCAAG 18300 

CCAAAGATCG TCTGGACGGG GCGGTCCTTG CTTCAGATGC CTTCTTCCCA TTTGCGGATA 18360 

ACGTGGAAGA AATCGCCAAA GCAGGAATTA AGGCCATCAT CCAGCCCGGT GGCTCTGTCC 18420 

GTGACCAAGA ATCCATCGAA GCAGCGGATA AATACGGCTT GACTATGGTC TTTACAGGTG 18480 

TGAGACATTT TAGACATTAA GAAGATAAAA GGGAAGAAAA GAGTTTCTTT CCTTTTTTGG 18540 

CTTAAAATAC TAACTGAAAC AAGATTAAAA CGAACTTTTT TGATATAATG TTGGTAAATA 18600 

ATTCGCAAAA GAGGTTGAGG AATGAAACTG CTTGTTGTCG GTTCTGGTGG TCGTGAGCAT 18660 

GCGATTGCTA AAAAGTTACT TGAATCAAAA GACGTGGAAA AAGTCTTTGT AGCTCCTGGG 18720 

AATGATGGGA TGACTCTGGA TGGTTTGGAA TTGGTAAATA TCTCTATTTC CGAACATTAT 18780 

AAATTGATTG ACTTCGCAAA GACCAATGAT GTTGCTTGGA CCTTTATCGG TCCAGATGAT 18840 

GCCCTTGCTG CTGGTATCGT GGATGATTTT AACCAAGCTG GACTTAAGGC CTTTGGTCCG 18900 

ACTAGGGCTG CAGCGGAGCT GGAGTGGTCC AAGGATTTCG CCAAGGAAAT CATGGTCAAA 18960 

TACGGCGTTC CGACAGCAAC ATATGGCACA TTTTCAGATT TCGAGGAAGC CAAAGCCTAT 19020 

ATCGAAAAGC ATGGTGCGCC TATCGTAGTC AAGGCGGATG GCTTGGCACT TGGGAAGGGT 19080 

GTCGTCGTTG CTGAGACGGT TGAGCAAGCG GTCGAAGCCG CTCATGAGAT GCTTTTGGAC 19140 

AATAAATTTG GTGACTCAGG TGCGCGCGTG GTTATTGAGG AATTCCTTGA AGGAGAGGAA 19200 

TTTTCACTCT TTGCCTTTGT CAATGGTGAT AAGTTCTACA TCATGCCAAC GGCTCAGGAC 19260 

CACAAACGTG CCTATGATGG CGACAAAGGG CCTAACACGG GTGGTATGGG TGCCTATGCG 19320 

CCAGTCCCAC ACTTACCACA GAGTGTAGTT GATACAGCGG TTGACACCAT TGTCAAGCCA 19380 

GTTCTAGAAG GGGTGATTAA AGAAGGTCGC CCTTATCTGG GAGTTCTTTA CGCAGGGCTT 19440 

ATCCTGACAG CTGATGGACC GAAAGTCATT GAGTTCAACG CTCGGTTCGG AGATCCAGAA 19500

ACTCAGATTA TCTTGCCTCG CTTGACCTCT GACTTTGCTC AAAATATCAC AGATATCCTG 19560

GATAGCAAGG AGCCAAATAT CATGTGGACG GACAAGGGTG TGACTCTGGG TGTGGTTGTC 19620 

GCATCCAAGG GCTACCCGCT AGACTATGAA AGGGGCGTTG AGTTGCCAGC CAAGACAGAA 19680 

GGCGATGTCA TCACCTACTA TGCAGGGGCT AAGTTTGCGG AAAATAGCAG AGCACTGCTC 19740 

TCAAACGGCG GACGAGTTTA TATGCTCGTT ACCACAGCAG ATACCGTCAA AGAAGCCCAA 19800 

GCCAGCATAT ACCAAGAACT ATACCAACAA AAAATAGAAG GACTCTTCTA CCGAACAGAT 19860 

ATCGGAAGCA AGGCAATTAA GTAAAGATAT AAGAATAACG CGCCGTAGTC GCCAAACACG 19920 

ATAATGGTCG TCGTGGTGAA AAGACCAGAA CAGTGAATGT TCTGGTCAGG GGGAAACTTG 19980 

GAGACCTTAG GCTCAAAGTT TAGGAATGAA ACCGAAGGTT TGCTTCCGCC TCCATCACCT 20040 

AAGACCATTA TCAAAAAGAA AAATAAAAAT TCACAAAATA CGTTAATGAT CGTATGGTTT 20100 

GCGAGCGTTA GCGAGCTAAT ATAGAACAAT CACCGCCGTT GTGAAAGAAC GATTGGATGA 20160 

TAATCCAATC GTTCAGGGAA ATTGGAAGAC CTTGGGTTTC CAATTTAGGC ATGAGACACC 20220 

TTTGGTGGCT GCTGCCGTCC CTCACAAGCT AAGGTGATTG TTGAAAAAGA GGAAAAAGGA 20280 

GAAGAAATGA AACCAGTAAT TTCCATCATC ATGGGCTCAA AATCCGACTG GGCAACCATG 20340 

CAAAAAACAG CAGAAGTCCT AGACCGCTTC GGTGTAGCCT ACGAAAAGAA AGTTGTTTCC 20400 

GCACACCGTA CACCAGACCT CATGTTCAAA CATGCAGAAG AAGCCCGTAG TCGTGGCATC 20460 

AAGATCATCA TCGCAGGTGC TGGTGGCGCA GCGCATTTGC CAGGCATGGT AGCTGCCAAA 20520 

ACAACCCTTC CAGTGATTGG TGTGCCAGTC AAGTCTCGTG CTCTTAGTGG AGTGGATTCA 20580 

CTCTATTCTA TCGTTCAGAT GCCGGGTGGG GTGCCTGTTG CGACCATGGC TATCGGTGAA 20640 

GCTGGAGCGA CTAACGCAGC TCTCTTTGCC CTCCGTCTCC TCTCTGTAGA AGATAAGTCC 20700 

ATTGCGGATG CACTTGCCAA CTTTGCTGAA GAACAAGGAA AAATCGCAGA GGAGTCGTCA 20760 

AATGAGCTCA TCTAAAACAA TCGGAATTAT CGGTGGCGGT CAACTGGGTC AGATGATGGC 20820 

CATTTCTGCT ATCTACATGG GCCACAAGGT TATCGCGCTG GATCCTGCGG CGGATTGCCC 20880 

GGTCTCTCGT GTGGCGGAAA TCATTGTGGC ACCTTATAAC GATGTAGACG CCCTCCGTCA 20940 

GTTGGCAGAC CGTTGCGATG TCCTCACTTA TGAGTTTGAA AATGTCGACG CTGACGGTTT 21000 

GGATGCCGTT ATCAAGGATG GACAACTCCC TCAAGGAACA GATCTGCTCC GCATTTCGCA 21060 

AAATCGTATT TTTGAAAAGG ACTTTTTGTC AAACAAGGCT CAAGTCACTG TGGCACCCTA 21120 

CAAGGTCGTG ACTTCTAGCC TAGACTTGGC AGATATCGAC TTGTCGAAAA ACTATGTCCT 21180 

CAAGACTGCG ACTGGTGGCT ACGATGGTCA TGGACAAAAG GTTATTCGTT CAGAAGCAGA 21240 

CTTGGAAGCA GCCTATGCGC TAGCAGACTC AGCAGACTGC GTCTTGGAAG AATTTGTCAA 21300 




CTTTGACCTT GAGATTTCTG TCATCGTGTC AGGAAATGGC AAGGAGGTGA CGTTTTTCCC 21360 

AGTTCAGGAA AATATCCACC GCAACAATAT CCTGTCTAAG ACCATCGTAC CAGCCCGCAT 21420 

TTCTGAAAGT CTAGTAGACA AGGCTAAAGC TATGGCAGTG CGAATCGCAG AACAACTCAA 21480 

CTTGTCTGGA ACTCTCTGTG TGGAAATGTT TGCGACAGCT GATGACATCA TTGTCAATGA 21540 

AATCGCCCCA CGACCACATA ACTCTGGGCA CTATTCTATT GAAGCCTGTG ATTTCTCTCA 21600 

GTTTGACACC CATATTCTGG GTGTTCTCGG AGCACCATTA CCAGTCATCA AACTCCATGC 21660 

GCCAGCCGTT ATGCTTAATG TCCTCGGTCA GCATGTCGAG GCTGCTGAAA AATATGTCAC 21720 

AGAAAATCCA AGCGCCCACC TCCACATGTA TGGTAAAATA GAAGCAAAGG ATAATGGTAA 21780 

GATGGGACAT GTGACTTTGT TTAGTGATGT GCCGGATAGT GTGGAAGAGT TTGGGGAAGG 21840 

GATTGATTTT TAGGACAAGT CTATGATACA AATTATCGTT AATACATTTA TTGAAAAGTA 21900 

TAAGACTGGA GCAGTTGTTG AAGTGTTGTA TGCCAGTGCT GACCAAGATA AGGTACAAGC 21960 

TAAATATGAA GAACTAGCTG CACAATACCC CGAAAATTAT TTAGCTATCT ATAATGTACC 22020 

GCTGGATACG GATTTGAATA CACTAGATCA TTACCCGTCT GTGTTTATTG GAAAAGAGGA 22080 

GTTTGAGTAG AAATCTTGGT TTACCTAGAT AGCTTATTCC CAACAGCTTA AGAAGAAAGG 22140 

AAAAATTAAC ACATGATCAA CCGTTACTCT CGCCCTGAGA TGGCGAATAT TTGGAGTGAA 22200 

GAAAATAAAT ACCGTGCTTG GCTTGAGGTG GAAATCCTCT CTGACGAGGC ATGGGCTGAG 22260 

TTGGGGGAAA TCCCTAAGGA AGATGTGGCT TTGATTCGCA AGAAGGCGGA CTTTGACATC 22320 

GACCGTATTT TGGAAATTGA GCAGGAGACG CGCCACGATG TGGTGGGTTT CACGCGTGCG 22380 

GTTTCTGAGA CTCTTGGTGA AGAGCGCAAG TGGGTTCACT ATGGGTTAAC TTCTACTGAC 22440 

GTGGTGGATA CTGCTTATGG TTACCTCTAC AAGCAGGCCA ACGACATCAT CCGTCGTGAC 22500 

CTTGAAAACT TCACTAATAT CATCGCTGAC AAGGCCAAGG AGCACAAGTT CACCATCATG 22560 

ATGGGGCGTA CTCATGGTGT GCACGCTGAG CCGACAACCT TTGGTCTTAA ATTAGCAACT 22620 

TGGTACAGCG AAATGAAACG CAATATCGAG CGCTTCGAGC ATGCGGGTGC TGGTGTAGAA 22680 

GCTGGTAAGA TTTCTGGTGC GGTTGGGAAC TTTGCCAATA TCCCACCATT TGTAGAGGAG 22740 

TATGTCTGCG ATAAACTTGG CATCCGTGCC CAAGAAATCT CTACACAAGT CCTTCCTCGT 22800 

GACCTTCAOG CTGAGTACTT TGCGGTTCTT GCCAGCATTG CGACTTCAAT CGAACGTATG 22860 

GCGACTGAGA TTCGTGGTCT ACAAAAATCT GAGCAACGCG AAGTAGAAGA GTTCTTTGCT 22920 

AAAGGGCAAA AAGGGTCTTC AGCAATGCCT CACAAACGCA ACCCAATCGG TTCTGAAAAT 22980 

ATGACTGGTC TGGCGCGTGT CATTCGTGGT CACATGATTA CGGCTTATGA AAACGTCGCT 23040

CTCTGGCATG AACGCGATAT TTCTCACTCA TCAGCTGAGC GTATCATCAC ACCAGATACG 23100

ACCATTTTGA TTGACTACAT GCTCAACCGT TTTGGAAATA TCGTCAAGAA CTTGACAGTC 23160 

TTCCCAGAAA ATATGATCCG AAACATGAAC TCGACTTTTG GTCTTATCTT TAGCCAACGG 23220 

GCTATGTTGA CATTGATTGA AAAAGGCATG ACCCGTGAGC AAGCCTATGA CTTGGTGCAA 23280 

CAAAAACAGC CTACTCTTGG GACAACCAAG TAGACTTTAA ACCACTTCTT GAGGCAGATT 23340 

CAGAAGTAAC ATCACGTCTC ACACAAGAAG AAATCGATGA AATCTTCAAC CCAGTTTATT 23400 

ACACCAAACG AGTGGATGAT ATCTTTGAAC GTCTTGGACT AGGTGATTAA TTAAAAAATA 23460 

AACAGCGAGC TTCAATCTCG CTGTTTATTT TTTATCGAAA AGACTTAGTC TTCTTTTCTT 23520 

TTAGTGAGTC CATAGGCTGC TAGTGTGGAC ATGAGTCCTG CGACTACTAG TCCTGCAGAA 23580 

TCGTGAGTTC CTGTTTCAGG AAGTTTTTTC TCTGTTACCA CAGGAGCTGG ATCTTGAGGA 23640 

AGAACTTTGC TTTCCTCAGC AGGAGCAGTT GATGGAGCTG GTTGGCTTGG GATTTCTAGT 23700 

TTTGGTTTTT CTTCAGCAAT AGCGGCTTGT CCGTTTTCAT CGCCTACATG TGTTACCATA 23760

GTTCCGACTT CGACTATTTG AGTAACGGCT TCCTGTGCTA CGACACTATT TACAAGTGTT 23820 

TTCACTTCCT TACCATCGGC AGAAGTGCTC ACAGAGTAGA AGTTGCTACG ATGTCCATTG 23880 

ACGCCCTTAG TAATGACTTG TGTTTTTCCT TTGAGTAAGA GTGGATTTTC ACAAGTCACT 23940 

GTGGTAAATG GAATTTCTTC TTCTTGGATA TCCAGTCTAG GTTTTACCTC AGTAGTTGGT 24000 

GCAAGACCAC TTTCATCACC CTTGTGAGTT ACAGGAGCGC CAACTTCAAC CACTTGGTTT 24060 

ATAACTTCTT TGGTTACCTG GCTATCAAGG ACTGTTTCTG TTGTTTTTCC ATTTTCAGTG 24120 

AGTACAGAGA TGTAATGAGT TCGTTCACCT TTGACTCCTG CTGTGATAAT ATTTTCCTGA 24180 

CCGGCTGGGA GGTTAGGATT TTCTTTCTTG ATAACTTCAA ATGGAATTTC TTCAGTTCTT 24240 

GTGATGAGTT CTGGTCTGGT TTCAACATTG GCAGCCACTT CATTTTCATC TAGGCTTCCT 24300 

GAATGAGTTA CAGCTGGTTT GAGGCCTTGA AGAGCGGCTT TTAGGTTGGC TACAAGCGTG 24360 

TCAAGCTCAG CTTGTTTATT ACGGTTGAGG TTGTAATTTA GAGCTGTTTT AGCTGCGTCA 24420 

AGGGCCTCAA GACTTTCTTT ACTATATCCT TCTAAGTTTG TAGGAATTTT AGCTAATTCT 24480 

TCGCGGAGAG CATTATAATT AGCACGAAAG TAGTCTTTGT TGTGGTCTGC AAAGGCAGTC 24540 

ATGAGTTCAA AGATTTCCTC TTCCTTGTAT TCAGCGCTTG GTCTATCTGC CCAGATTGAA 24600 

AGCATACTTC CGACTGTTGG AAGATCTACT TCAGGATATT TGGTAGAAGC TAGTTGATTG 24660 

AATGGTGTTT TTCCAGTATT CTCAATAGCT TTCTTGAGGA AACCACCACC ATCTTCTGGT 24720 

TTTTGACCAA GAATGTAGTA CCAGTCACCG TTGGTATTCA AGAATTTATA GCCTTTGCTT 24780 

GCTAGGTATT GAGGTGATGC GAGGTTATAT CCCCACCAGC CTTTAGACCA GTAAGAAATC 24840

AAGACATCTT TGTCAAACTG AACATCGTCC TTGTCTTCAT AGTAGAAGCC ATCGTTGAAG 24900

GCCATTGGTT GAAGCCCTCT TTCTTTGGCC ATAGCTGCGA GGGTGTTGGC ATATTCGGCA 24960

AATTTGCCAT AGAGTTGATA CCACTTGAGG TAGTACCAGC CTTGGGCACT AGTCGCATCG 25020

TTGGCGTATT CGTCAGTACC AAAGTTGAAA ATCTTTGTTT TACCTGGAAA GAAGTCCATG 25080

TATTTACCGA TGAGGGCTTT TACAAAGTTC ATCGCTTCTT CGTTTTTCAA GTCCATAGTT 25140

GTTTTTGAAA CTTTATCAAA GTGGGCTTGA GGATTTTTAA TACCTAATTT TTCCATGGCA 25200

ACCAGCATAG CATCCATGTG ACCTGGACTG TTAATAGCTG GGATGAGACC GATGTCCTTA 25260

GATTTAGCGT ATTCAATTAG CTCTGTTACT TCTGCCTGTG TTAGTGCAGT ACCGTTTGGA 25320

TCGTCGTAGT AAGCTTTAGT TCCTTCGATA ATAGCTTTTT TAACGTCATC ACTAGCATAG 25380

GTTTTTCCGT TGGCAGTAAT GGTCATATCA TCGAGTAGAA AGCGAAGTCC GTCATTTCCT 25440

AGAAGGAGAT GGACATCAGA ATATCCGAGC TCACTGGCCT TGTCTACGAT GCGTTTGAGC 25500

TGGTTCAGAG TAAAGTATTT GCGTCCAGCA TCGATTGAGA TTACCTTGTT TTTGGCAAGT 25560


TTTTCAACCT CACGTTTAGC TTCTTCTTCT TTTTGAGCTT CAGGCGTGAG GGTCAAGTTG 25620

TTGACAGTTT CTTGAAGTTT AGCAATGGCT TGATCAATCG TATCTTGTTG GGCACGGCTA 25680

AGGTTGCTAT CGAGAGAGCG AATAGCTTTT TCAGCTTCTT TTACGGCCGT GACGCTTTCT 25740

GCAGTATAAC GGTTCAGGTC TTTTGGTACC TCGTTAAGTG CTTGCTCTGC AGATTCATAA 25800

TCAGCTGCGA AGTATTCAGC GTTGGCATTT GCAAAATGAC GCATGAGTTT GAAGAGGCGT 25860

GATGGTGAAT AACGTGCAGA TGGAGTGTCA GCCCAAGCAG CTACCATAGC ACCGATGATT 25920

GGGATATCAG CTCCTTCTGT TTTTGGTACA GAAGTGATTG GTGTGTTTTT AATACCATTG 25980

AGCCCCTGAT CGAGATTGTA CCAGCCTTGG CCATCAGCGT TTCGTCCAAG AACGTAGTAC 26040

CAAGCATCAT TGGTATTAAG GATTTGGTGA CCTTTTTCAG CTAGTAGTTT AGAAGAAGCG 26100

ACATCGTAGC CTCCCCAACC ACCAGTCCAC ATAGAAACGA TGATGTCTTT GTCAAAACTA 26160

CCAAAGCTTG TGTCGCTATT GTAGTAGATA CCGTCGTTAA AAGCCATTGG TTTGAGACCG 26220

TGCGATTTTA CAATACGAGC GAGGTCATTG GCGTAGGCAA TAAATTTTTC ATAGCCTTTT 26280

ACAGGGTAGC CTTCGTTTGG ATAGTATTTA TCAGCTTGAA GCACACTCCA ACCTTTAGCA 26340

TCTGTCGCAT CATTGGCATA TTCATCAAGT CCGATGTTGA AGATTTCAGT CTTTTTCGCG 26400

AAATAAGCAG CATACTTGTC GATAAGGGCT TTTGTAAAAG CGACAGCTTG TTCGTTGTCA 26460

AGATCGACAG TACGGGCTGA TTTCTTCCCA AAATAGCTAA AGTTAGGGTT TTGGATTCCG 26520

AATTCTTTCA TGGCATTGAG AATCGCATCC ATGTGTCCAG GACTATTTAC TGTCGGAATG 26580
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AGACCGATAC 


CTTTATCTTT 


GGCATAGTTA ATCAGATCTG 


TPATTTGAPT 
1 \~n 111 \J*r\\w 1 


1 Iv, ITj 1 1 AftO 




TGATTGCCGT 


TTGGATCGTT 


GTAATAATCA 


TTTGTACCTT 


1 1 1^/U\1\3U^. 


ftfYWivir'* 


26700 


TCGTCACTGG 


CATAGGTCTT 


GCCGTTAGCT 


GTGATGCTCA 


t 7v TPCTrr a a 

1/11 l.u 1 V-v./a/a 


PiWi a a rvr > 


26760 


AGTCCATCAT 


TTCCGACTAA 


TAGGTGTAAA 


TCAGTGTAGC 


Lnlnnlul 1 1 


t-liCl 1 1 A ICts 


26820 


ATGATTTCCT 


TGAGCTGTTC 


TGGTGAGAAA 


TATTTACGTC 


LftuLA 1 LAn 1 


J\ /"» TV TV TV A % X MUM 

AtiAAACAATT 


26880 


TTCTTTTTCG 


\* i n\3 11111^. 


ATTTACAGTT 


GCAGCACGTT 


CCTTTCCTGC 


CTCTGTTGCC 


26940 


GGTTTGTCAG 


T^V*ll*lUl«l 1 1 


CGCTTCATCT 


TTTTTAGCTG 


GTTTATCCTT 


GTCAGTCTTG 


27000


TCTGTATTTG 


1L1 11 n\2i\ 


ATCAACCTCT 


TTCGCTTCTT 


CCTTTTTAGG 


GCTAGCTTCT 


27060 


fPTfJPP'P'FT'P 


TiTTirir'iip'p 

Inl 1 /ivjOio I 


TTCTTTTTCA 


GCAGAAGTTG 


GAGTTACCAC 


TTCTGCTTTA 


27120 




l luAALIAAL 


TTCCTCTTGT 


GGTTTTTCTT 


CTGTTTTTGG 


AAGACTAGCT 


27180 


ACCTTATCAR 


TAP-PTYIfiaPT 


TTCTGTTTCT 


ACAGTTTTTG 


GAGCTTCTGG 


TTGAAGCACT 


27240 


fSP'PTTAP.CI'TYI 


-ill l^-ftO 1 


CCGATTTTCG 


GATGATTGAG 


GGGAATCAGA 


AACCGTATGG 


27300 


AIYSfSTPrV^TT 1 

1AJ(J 1 V^Vjrtj 1 I 


GGTTTTCTGT 


AGTAGTAGGA GTAACTCCAT 


CGGCTGCAAC 


AGTCTGTGCT 


27360 


TGGAAGfiPAA 


A1LLAA1 Inu 


AACAGAAGCT 


GCTCCTACAG 


CGTATTTACG 


AATAGAAAAA 


27420 


V.UV1U1 lOi 1 


TTTCATGTTT 


CATTGCAAAA 


CCTCCTGATT 


GCATTGTTAT 


ATTGATAGCG 


27480 


ATTATATAAA 


TP A APfiPPTT 


TATTTTATTT 


CTTATATTAA 


TTTCTTATAT 


TAACGAGAGT 


27540 


CAAGAGGAGA 


1 unLivuvUVA 


CTATAATAAG 


TATAAAAAAA 


TATAAAATTT 


AAACTTAAGA 


27600 


f'P'PP JV (~l a TTV2 


V»l LbisAAAAA 


ATACGTATAT 


ATATCTAGTA 


TAATTTTTGG 


TTCTATTTCT 


27660 


ATAAAATATT 

<nx#Wv\XflX X 


CPAPAAATTA 


TAGAATTTTC 


CAAAAATAGG 


TAAGCGCTAC 


CTTTTTGGTG 


27720 


TAGTATAATA 


APU^ATARAAA 


AAGCCCAAGC 


GATTAGCTCA 


GGTTTTCTTC 


TTAGTGATCA 


27780 


CGGTCACATG AGATAAATTT AATCTTGTAG TAATCAGATC GTTTGTAAGT TTCACTGTAT 27840

TCTAAAACTT GGCCAGTTGA TTCGAGTTTG GTGATTTTAG TTTGTAGGAC AGTAGGGAAT 27900

TGTTCATCGA CTCCGAGGAC TGAAGCTGCA TGTTCTGGAG TTGGAAAGAC TATTTCGTTG 27960

ATTTCTTCAA AGTGTTCATC ATTCATGTGA ATGTGGTAGT CTAACTTGAA ACGATTATAG 28020

ATAGAACTAT AGTATTCAAG GTTTGGATAA TTTGCGTTGA TATATTGTTC TGGGATGTAG 28080

GATGTATGGT AGATATAAAC GACACCGTTT GATTCGCGGA TACGTTCAAT CTTGTAGTAG 28140

AATTGATCGC CGCGTAGACC CAATTTTTCC AAGTAAACAA GCTTGTTTCC GCGTTCAATT 28200

GAAAGAACAG TTACCTTATC ATCTTTAGCA TTGAAGAGTT CAATATCTGA AAACTCTACA 28260

AGCTTGTGTT TGCGTGCACG TGAAACGAAG GTTCCTTTTC CTTGTTGGCG GACAATATAG 28320

CCATCTTTGG CAAGGTCGTT TAAGGCGCGA ACAACTGTGA TAGAGCTGAC ATCGTACATT 28380

GAAATGAGTT CTGCTTCAGT GTAAAATTTA TCTCCACTGC TAAACTGCCC AGAGATGATT 28440

TTATTTTTTA ATTCGTCTTT TATGTATTGA TGG 28473

(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6749 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:



CCTGATGGGT GGTATGCGAG GATACAGTTC TGAAAATCGC CGTTACTTAA TTAATGGACG 60

CGAAGTCACA CCTGAGGAAT TTGCTCACTA TCGTGCGACT GGTCAATTAC CAGGAAATGC 120

AGAAACTGAT GTGCAAATGC CACAACAGGC ATCAGGTATG AAACAAGGCG GTGTCCTTGC 180

AAAACTAGGT CGAAACTTAA CAGCAGAAGC GCGTGAGGGC AAGTTGGATC CTGTTATCGG 240

ACGAAACAAG GAAATTCAAG AAACATCTGA AATCCTCTCA CGCCGCACCA AGAACAATCC 300

TGTTTTGGTC GGAGATGCAG GTGTTGGTAA GACAGCAGTT GTCGAAGGTC TAGCGCAAGC 360

CATTGTGAAC GGAGATGTTC CTGCTGCTAT CAAGAACAAG GAAATTATTT CTATTGATAT 420

CTCAGGTCTT GAGGCTGGTA CTCAATACCG TGGTAGCTTT GAAGAAAATG TCCAAAACTT 480

AGTCAATGAA GTGAAAGAAG CAGGGAATAT TATCCTCTTC TTTGATGAAA TTCACCAAAT 540

TCTTGGTGCT GGTAGCACTG GTGGAGACAG TGGTTCTAAA GGACTTGCGG ATATTCTCAA 600

GCCAGCTCTC TCTCGTGGAG AATTGACAGT GATTGGGGCA ACAACTCAAG ACGAATACCG 660

TAACACCATC TTGAAGAATG CTGCTCTTGC TCGTCGTTTC AACGAAGTGA AGGTCAATGC 720

TCCTTCGGCA GAGAATACTT TTAAAATTCT TCAAGGAATT CGTGACCTCT ATCAACAACA 780

CCACAATGTC ATCTTGCCAG ACGAAGTCTT GAAAGCAGCG GTGGATTATT CTGTTCAATA 840

CATTCCTCAA CGTAGCTTGC CAGATAAGGC TATTGACCTT GTCGATGTAA CGGCTGCTCA 900

CTTGGCGGCT CAACATCCAG TAACAGATGT GCATGCTGTT GAACGAGAAA TCGAAACGGA 960

AAAAGACAAG CAAGAAAAAG CAGTTGAAGC AGAAGATTTT GAAGCAGCTC TAAACTATAA 1020

AACACGCATT GCAGAATTGG AAAGGAAAAT CGAAAACCAC ACAGAAGATA TGAAAGTGAC 1080

TGCAAGTGTC AACGATGTGG CTGAATCTGT GGAACGAATG ACAGGTATCC CAGTATCGCA 1140

AATGGAAGCT TCAGATATCG AACGTTTGAA AGATATGGCT CATCGCTTGC AAGACAAGGT 1200

GATTGGTCAA GATAAGGCCG TAGAAGTTGT AGCTCGTGCT ATCCGTCGTA ACCGTGCTGG 1260

TTTTGATGAA GGAAATCGCC CAATCGGCAA CTTCCTCTTT GTAGGGTCTA CTGGGGTTGG 1320

TAAGACGGAG CTTGCTAAGC AATTGGCACT CGATATGTTT GGAACCCAGG ATGCGATTAT 1380 

CCGTTTAGAT ATGTCTGAAT ACAGTGACCG CACAGCTGTT TCTAAGCTAA TTGGTACAAC 1440 

AGCAGGCTAT GTGGGTTATG ATGACAATAG CAATACCTTA ACAGAACGTG TTCGTCGCAA 1500 

TCCATACTCT ATCATTCTCT TGGATGAAAT TGAAAAGGCT GACCCTCAAG TTATTACCCT 1560 

TCTCCTCCAA GTTCTAGATG ATGGTCGTTT GACAGATGGT CAAGGAAATA CAGTAAACTT 1620 

CAAGAACACT GTCATTATTG CGACCTCAAA TGCTGGATTT GGCTATGAAG CCAACTTGAC 1680 

AGAAGATGCG GATAAACCAG AATTGATGGA CCGTTTGAAA CCCTTCTTCC GTCCAGAATT 1740 

CCTCAACCGC TTTAATGCAG TCATCGAGTT CTCACACTTG ACTAAGGAAG ACCTTTCTAA 1800 

GATTGTAGAT TTGATGTTGG CTGAAGTTAA CCAAACCTTG GCTAAGAAAG ACATTGACTT 1860 

GGTAGTCAGT CAAGCGGCTA AAGATTATAT CACAGAAGAA GGTTACGACG AAGTCATGGG 1920 

GGTTCGTCCT CTCCGTCGCG TGGTTGAACA AGAAATTCGT GATAAGGTGA CAGACTTCCA 1980 

CTTGGATCAT TTAGATGCTA AACATCTGGA AGCAGATATG GAAGATGGCG TTTTGGTTAT 2040 

TCGTGAGAAA GTCTAAGACA GAATTTTGAG GATAAAAAAG AAGGAGCCAG CTGAAAAAAA 2100 

CTGGTTCCTT TTTAGGTACG ACAGGCATGT CGTATAGTAG AAGTGTATTA TTCTAGTTTC 2160 

AATATACTAT AGTAGCTCAG AAGTCGGTAC TTAAACGTGC TATATCAAAA CCAGTCCTGG 2220 

AAAAACGTGG ACTGGTTTCG TGTTTGGATT ATTACCTTGA ACGACATGCG TTAAAAGTTA 2280 

GTTGAACCGC CGTATGCCGA ATGGTACGTA CGGTGGTGTG AGAGGGGCTA GAGATTATCC 2340 

CCTACTCGAT TTTAAATCAC ATGACGTTCA AAGGCATCAT CTGAAATCCC TTGTTCCAAG 2400 

ATGAGTTTTG CCCATTCTTT AGCAGAGAAG AGGCTGTGGT CCTTGTAGTT TCCGCAAGAT 2460 

TCGATGGTTG TCCCTGGGAC ATCTTCCCAA GTAGTAGTTT CAGCGATTTG CTTGAGCGAA 2520 

TCCTTGATAA CAGCTGCGAT TTTAGCACTG GTGTGACGTC CCCACATAAT CATGTGGAAG 2580 

CCTGTGCGGC AACCAAATGG TGAACAGTCA ATCATGCCGT CAATGCGGGT ACGGATGAGT 2640 

TTGGCTAAGA GGTGCTCGAT AGTGTGAAGG CCGGCAGTAG GGATAGAGTC TTCGTTTGGT 2700 

TGCACCAAGC GAATATCATA ATTGGAGATG ATGTCTCCTT TTGGTCCTGT TTCTTCCCCA 2760 

ATCAAGCGAA CATAGGGTGC TTTGACAATG GTGTGGTCAA GTTCAAAACT TTCGACAATA 2820 

ACTTCTTTTG ACATGGTAAA TCCTTTCAGT TTTCTTCTCT CATTATATCA TAAAGGTTGC 2880 

TCCTGAGACA GAGAGAAAAC CTCTCCGAGG CTGGAGAGGT TGAAATCTTT ACTTACGATA 2940 

TAAGCGGTCG TATTGGTAGT ATGGGTCAAA GGTTACGTTG ATACCCAGTT TACGAAGGAC 3000 

ATTCTTGTCT TCATCAGTCA AGATGATGGT TGAGTGGGCT TCGCTTCCTT TGAGGTTGCC 3060

GAGTTCTTCC ATAGCGCGGG CAGCATCAGG ATTTTCTGTA GCTGTGATAG CAAGTGCAAT 3120

CAGGATTTCA TTTGAATGAA GGCGTGGATT GCGGCTACCG AGATGATCGA TTTTAAGACC 3180 

TTGGATTGGC TTAACAACTT CAGGCTCGAT TAGTTTTACT TCTTTAGCGA TGTCAGCTGA 3240 

TTTTTTGATG GCGTTGATCA AGGCAGCGGC TGTAGGACCA AAGAGTTCTG AGTTCTTACC 3300 

AGTGATGATT TCCCCATTTG GCAATTCAAA GGCTAGGGCT GGTCCACCAG TTTCTTCTGC 3360 

TTTTTGGCGC GCAACGACAG CAACCTTACG GTCTGCAGGT GTGATACCGA GGTCGTTCAT 3420 

GAGCAACTCA ATTTTCTTGA CGGCAGCTTC GCCAACTTTT TCAGCTTTGA AGTCAAGAAC 3480 

TGTTTGATAG TAACGGCGGA TGATTTCTTG TTTAGAAGCT TCGACAGCGG CCTCGTCATC 3540 

TGTAATAGCG AAACCAACCA TGTTGACACC CATATCTGTC GGTGAAGCGT ATGGTGATTT 3600 

TCCGAGAATA CGTTCCAACA TGCGTTTGAG CACTGGGAAG ATTTCGATAT CACGGTTGTA 3660

GTTGACAGTG GTTTCTCCAT AGGTTTGAAG ATGGAAGGGG TCAATCATGT TGACATCATC 3720 

AAGGTCAGCT GTGGCAGCTT CATAAGCCAA GTTAACTGGA TGATGAAGGG GAAGATTCCA 3780 

AACAGGGAAG GTTTCAAATT TAGCGTAGCC AGATTTGATG CCATTGATTT GGTCGTGGTA 3840 

CATATTGGAC ATACACGTTG CCAATTTTCC AGAACCAGGT CCAGGAGCGG TTACGACAAT 3900 

CAAGTTGCGA CTGGTTTTGA TGTAGTCGTT TTTGCCCATG CCTTCTGGGG AAATGATGTG 3960 

ATCCATATCC GTCGGATATC CTTTGATTGG ATAATGAAGA TAAGAATCAA TTCCGTTTTT 4020 

CTCAAGTTGA TTGCGGAAGG CATCTGCAGC GGGTTGGCCA GCGTATTGTG TAATGACAAC 4080 

GGAACCAACA AAAATCCCTA ATTCATTGAA TTTATCAATC AAACGAAGAA CTTCTTGGTC 4140 

ATAAGAAATG CCTAAGTCGC CACGTGCTTT GGAATGTTCA ATGTTGCTAG CATTAATGGC 4200

AATCACAACC TCAACCTGCT CTTTCAATTC TTGCAAGAGC TTGATTTTGT TGTCAGGTTC 4260 

ATAACCAGGA AGGACACGAG CAGCGTGGAA ATCTTCTAAC ATTTTACCGC CAAACTCTAA 4320 

GTAGAGCTTG CCGTCAAATT GGTTAATGCG CTCCAAAATA TGGTCGCGTT GTAAATTCAA 4380 

ATATTGTTCA GAACTAAAAG CTTGTTTTTT CATTTTTTTA CCTCTGGACT CTATTATAAT 4440 

AAAAAATTGG AAGTTAGGAA ACTACGGAGC TAAAAAAGAA ATTAAAAAGA TTAAGCAAAC 4500 

GCTTGCACAA AATTTTAAAA AGTGCTATCA TAGACTATAG ATTATGAAAA TAATGAGGTA 4560 

AACAGATGCA AGAAAAATGG TGGCACAATG CCGTAGTCTA TCAAGTCTAT CCAAAGAGTT 4620 

TTATGGATAG TAATGGAGAT GGAGTTGGTG ATTTGCCAGG TATTACCAGT AAGTTGGACT 4680 

ATCTAGCTAA GCTAGGAATC ACAGCAATTT GGCTTTCTCC CGTTTATGAC AGCCCTATGG 4740 

ATGATAATGG CTATGATATT GCTGATTATC AAGCGATTGC GGCTATTTTT GGAACCATGG 4800

AGGACATGGA TCAGCTGATT GCAGAAGCTA AGAAGCGTGA CATTCGTATC ATCATGGACT 4860

TGGTGGTCAA TCATACCTCA GATGAACATG CTTGGTTTGT CGAAGCCTGT GAAAATACTG 4920

ACAGCCCTGA GCGAGACTAC TATATCTGGC GCGATGAACC CAATGACCTA GATTCTATCT 4980

TTAGTGGGTC TGCTTGGGAA TACGATGAAA AGTCAGGTCA ATACTATCTC CACTTTTTCA 5040

GCAAGAAACA GCCGGATCTC AACTGGGAAA ATGAAAAACT TCGCCAGAAA ATTTATGAGA 5100

TGATGAACTT CTGGATTGAT AAAGGTATTG GTGGTTTCCG TATGGATGTT ATTGACATGA 5160

TTGGCAAAAT TCCTGACGAG AAGGTAGTCA ATAATGGTCC TATGCTCCAT CCCTATCTCA 5220


AGGAAATGAA TCAGGCGACC TTTGGAGATA AGGATCTCTT 


GACAGTAGGG 


\3f*\Jt\K- X X VJVjV? 


5280 


GAGCAACTCC AGAGATTGCC AAGTTCTACT CTGATCCAAA 


GGGGCAAGAA 


X X VJ 1L1 r\ X 


5340 


TCTTCCAGTT TGAACATATC GGTCTTCAGT ATCAGGAAGG 


TCAGCCTAAA 


* wVJV#nv* X n X V* 


5400 


AAAAAGAGCT GAATATCGCT AAGTTAAAAG AAATCTTCAA 


CAAATGGCAG 


X Xnv 


5460 


GAGTTGAGGA CGGCTGGAAT TCCCTCTTCT GGAACAACCA 


TGACCTCCCT 


V* vJ X *\ X X u X X 


5520 


CAATCTGGGG AAATGACCAA GAATACCGCG AAAAATCTGC CAAAGCCTTT GCAATCTTAC 5580

TTCATCTCAT GAGAGGAACT CCTTATATCT ACCAAGGTGA GGAGATTGGG ATGACCAACT 5640

ATCCGTTTGA AACACTGGAT CAAGTAGAAG ATATTGAATC TCTCAACTAT GCGCGTGAGG 5700

CTCTTGAAAA AGGTGTTCCG ATTGAAGAAA TCATGGACAG TATCCGTGTT ATTGGACGTG 5760

ACAATGCCCG TACCCCTATG CAATGGGACG AGAGCAAAAA CGCTGGTTTC TCAACAGGTC 5820

AACCTTGGTT GGCGGTTAAT CCAAATTACG AGATGATCAA TGTCCAAGAA GCGCTGGCAA 5880

ATCCAGATTC TATTTTCTAT ACCTATCAGA AACTGGTCCA AATTCGCAAG GAGAATAGCT 5940

GGCTAGTTCG AGCTGACTTT GAATTGCTTG ATACGGCTGA TAAGGTCTTT GCTTATATAC 6000

GTAAGGATGG CGACCGTCGC TTCCTAGTTG TGGCTAACTT GTCCAATGAA GAGCAAGACT 6060

TGACAGTAGA AGGAAAAGTC AAATCTGTCT TGATTGAAAA CACTGCGGCT AAAGAAGTAC 6120

TTGAAAAACA GGTCTTGGCT CCATGGGATG CTTTCTGTGT GGAATTACTA TAAATATTTT 6180

TTGCAGAAAA ATTTAAAATT GAAATCGTAT AAAAACAAGG GAGGACTGTA TAAAAGACAG 6240

AAATCCTTTG TTTTTTATAA CCAAAGTTTA TAAACTTTCA TTCTTGAAAT TCAATTAACT 6300

TTACAAATTC CCACTATTAA GGAGAAAGAA GATGAACATA AAGAAGCGTG TCCTTAGTGC 6360

AGGCCTGACT TTTGCATCTG CTTTGCTTTT ACCCAAATCA TTCATACCTC TCTCAACTAG 6420

ATGTAACTTA CAAAACCCCT GACCTCATGA GCCACTTTCT TCCTCCTCAT GAGGTCAGTT 6480

TTACTTTCTG CTGTTCCAGT ATCGTTTTTC CTCGCTAGAT TTCCTCAAAA GGGCAGACTC 6540

CTCCCTTGGT GCGTCACACG ATTTTTTCAT CTCGACTGTT CTTTAATGCA TCATTAACGA 6600

CGCTTTTCTT CTAGGTGGTT CATAAGGAAC AGGAAGATTC AGGTTGACTT TTCTAATCCT 6660

AGAATAAAGT GCTGAAAACA ATTCGGAATA GGCATAGAGA CTAGACAATT TGAGGAGCTG 6720 

CTTGCGTCCT GTTCGAACAC ATTTTCCGG 6749 

(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1842 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 

TCTACCCATG GACTTTGAGG CATTCATTGT TCCATCTTCT AGTGGCGAAT CTTTTGATAC 60 

AAACGATTCA ATTCACTTGG ATAGTGAAAC TCTCCCGCAA ACATTTTTCT GGTTAACTCA 120 

ATCCAGCTGA TATTTCTTTC AGCCAAAATA ATGGACAAGT TCTCCCAAAA TCGTTCAGCC 180 

ATATTGCTTC TCCTTTAGTT AGATAAATAA TGTGTTTGCG CCATGTAAAT CAATTGTTTC 240 

GTATCTCTTG GCAATAGAGC TCTAGCCTCT TCCAAATTCA GACTTGGATA AACTCGCTTA 300 

TTTGAAACCG CAAGAGGAAG TCTGATGGTT AGTTCAGGAT TTTTTAAAAT TATCTCAACG 360 

AAATCCGTTA ATCTTAGATT GTCACGGTTC TTAAATCGTA ATAAATTGGG AGATAAAAAC 420 

TCAAAACAAT CTGAAGAATA GCTCATCATC TCAATTAATT TGTCCTTTGT CATTTCAGAA 480 

ACTGAATGAC AAGATACCTC TATGCCATAG TTTTGGAAGA AATCTAAAAG AAGTTGATTT 540 

CTTTGTCTAT TTTTACTTAG ATAGAGATCA ATCATGGGAG ACCTCCCAAA GATTCGGTTC 600 

CATTTGATAT TCTGACACGA TTAAGGAATC TAATAAATTA AGGAATCTAA TAAATTTGCG 660 

AAGTTAATCG GTTTCTTGTC TTCATCATAA GCTTTTACAG TTACTTGGGT TGTAAGTATT 720 

CCCTCTTTTC CCTCGGCTCG ATAGCCTTGT CCATATAAAA CAAAAACGAG ATTTTGATGA 780 

TCATCTACAA AGGCATCAAC CCCATTCTTT ATGTCTTGAC TTTCAAGGAA TTCCATAACG 840 

TTTTGAAGAT AGGATTCGTA AAATAGTGGG TAGTTATGTT TTTTATGGTA ATCATCTAAA 900 

AATGTCACTT CAAACTCACA TGGAGAGTAA TTTTGACTTT GAACAGCCTA AAAGTGCCAT 960 

CAAATTTGAA TTGGAATAAA TCAAATAAAT AGCCCCATCC TCATCAATCC AACCTTTGCT 1020 

CAAAGACAAC TCCAACCGAT CTTTTAAAAC TGAGTAAACC ACCTTAACCT CCAGTTTCAT 1080 

ATTCTTATAC CGTTCACTCT CAAATAAAAG TTTGGGGAGC TTATAATAAC GCTCTGATGT 1140 

CTGATATTGA TTAGCGGTAA TACGCTTCAT TATTGTCCCT CCAAGACTAA AATTCCAACA 1200

TTTCCAAATT CATCAAATCG GATTAAACCT ACTTGTTCCA TTTCATCAAC TAACTGAGTT 1260

GCTTTTACCC AAATCATTCA TACCTCTCTC AACTAGATGT AACTTACAAA ACCCCTGACC 1320 

TCATGAGCCA CTTTCTTCCT CCTCATGAGG TCAGTTTTAC TTTCTGCTGT TCCAGTATCG 1380 

TTTTTCCTCG CTAGATTTCC TCAAAAGGGC AGACTCCTCC CTTGGTGCGT CACACGATTT 1440 

TTTCATCTCG ACTGTTCTTT AATGCATCAT TAACGACGCT TTTCTTCTAG GTGGTTCATA 1500 

AGGAACAGGA AGATTCAGGT TGACTTTTCT AATCCTAGAA TAAAGTGCTG AAAACAATTC 1560 

GGAATAGGCA TAGAGACTAG ACAATTTGAG GAGCTGCTTG CGTCCTGTTC GAACACATTT 1620 

TCCCACCACG TGAAGAAAAA GATGGCGGAA GCGTTTGATT GTTAAAGTTT GGAAGTCACC 1680 

TCCAGCTAGA TGTTTGAGAA AAAGATAGAG ATTGTAGGCG ATACAGCTCA TCATCATACG 1740 

AACTTCGTTT TTGATTAAGG TTGAACTATC CGTTTTATCG CCAAAAAATC CCTCCTTCAT 1800 

CTCCTTGATG AAATTCTCGG CTTGACCACG TCCACGATAA AG 1842

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19390 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

TCATCTTTAT CTCCTCGAAA TTTTCTAATA TAGCCATTAT AACAGAATTT TGTGAAAATT 60 

CCTATTATAG TAAATCACTA TTTCAGTATA AAAAGAAAAA ACGAATCAGA CGATTCGCTC 120 

TTCTTAAAAT CTGAAAATAG CTTTCCAGAA AGGATTAGCC GATTTTTTGC AGATTGAGCA 180 

CTGCATCGTG ACTCATCAAG ACTTGACCAT ACTCTTGTAA GACTGAGCGA CTGATATCAC 240 

TATCGTCTGC AAACTCGCGC ATACGGGCCA ACAGCCAAGC TGGATATGGG CTTGGATGAT 300 

TTTCAATATC CACTAAAATG GTCAAATAAT AGCGCTCGTT CATTTTGTAG AGTTCAGAAG 360 

TTTCCATTTC AAAAGTCACT GTCTTGGCAA AAGCTACCAA GTCAGCCAAC TTAGCAAAAG 420 

AAAGGATGTA GTAGATGTAA GGTTCTTTCT TACTCTCAGC TTCTTGTTCA GCCTGCTCTT 480 

GCTCTTCTTC CTTGACTTCA ACTTGCTCAA GAGATTGAAT GGCTTCGATA TCATCCTTGG 540 

TTTTGTCTGC GATGCTTTTT TCCAGGGTTT TGATAAATTC ATCTGGAGAC ATTTGAGCCA 600 

ATTCTTCCAT ATCTGGCAAA TCCGATAAGT CTTCAAAATC TAGATTTTGG TCAATCTTTG 660 

ACTTGGTCAC AAAGACATCT ACCTTATCAG GTTTTGGAGT CACACGGAAG CTCAACATGC 720 

CTGTATCCAG AAAGCTATCA GGCATCTCTA GCTCATCCAA GATAGCATAA AAGAACTCTT 780

CTGTTTTTTC TTGAGGAACG AGAAAGTCAG CAATCTCCAT TCCACGATCC ATCAAATCCT 840

CTAAAGATAT CGTGATTTTT AAAGTTGTAT CACTAATTTG TTTCATTTTC ATTGCTAGTA 900

ACCTCATACT TTCAGTTCTA TCTATTATAC TAGATTTTTA CGATTTTATC AAAAGAAGGC 960

TCCTCTATAC GGATAGATTT TCCCTAGGGT CTTTCTATAG GAGACTCCAA AAGAAAATTT 1020

CTGCAGACAG ATAGAAAAAG CCTTCAAAAT CGGCTAAGAG CCGACTTTGA AGACCTTATA 1080

CATCAGAATA CTTATAATTT AAAGGTTGCT ACACCGAGGA TAGAACGATT TAAGTTTCTG 1140

AGAATTTGAA GACTTTGCTC AAATTTCTTA TAACGAGTCA CTCCGTACTC TTCAACAAGA 1200

AGGACTGTAT CTCTTTCCAA AAGAGATGAT ACATCCTGTA AATCTACAAA ATGCATTCCT 1260

TTTAAAGCTT CTTGACTCTG TTTCAATTTA TCTAAGATAG CTTTATTTGA GCTAACGATG 1320

GTCAATTCCT GTCCAGTATT TTTGTATGAG AAAACATCTG CTAGGTTAGG AATTGTTGTA 1380

ATCTCTGTTA CAAAATCAAT TTGATACTGA GAAAAATCAC CTACTCTATT GATTGTTGGA 1440

TTAAAGAGAT AAACTAACAC ATTTCCCATC ACAACCAAAA TCACACAAAC CACTCCAATA 1500

ACAACTAAAC GAAGAATCAG ATTTTTCACA TTTAAGCCAA GCGCTGTTTC ACCATTTGCG 1560

TTCAATTCTT TAGAGTTGAT GGTTTCCAGT TTTTCAATTT TCACATTTGC ATAGGCATGT 1620

TTAAATTTCT CAATCAACCC ATCAATTTTT TTCTGTAACA AGTTATTGGC ATCTTTACTT 1680


GATGTCAAAA TTTTCACACC AACCCCTGCA TCGTCAATCA TATAGTAGAC GGTCAATTTT 1740

TTCCACCAAT AGTCATTCGT TGAATTTTTC AAGGTTGTTT CTGTCGTGTC TAATTCACTG 1800

GCAATTTTTT TCAACTCACT GGGTTCTACA TCATTGAAAA GATAAGCTCC ATTCAAATTA 1860

CCATCAATCA ATTTCCCATA AAAATCACTA TAACCACCAA TTTGATGATT CAAAATCGTT 1920

TTGTCCGACT GTTTTGGAGG AGTGATTTTA TAGATAAGAT AAGTTGAATA ACTTGTTGTA 1980

TCTTTGACAG TGTTTTTATT CCTAACTGCT TTAATTGTAA ATGGTACAGC AATGAGAGCA 2040

AATAAAGCGA TGAGAGCTAA AATATTTGCT TTTCGCTTTT TATAAAGATT TGCAAACAAA 2100

TCAGCTACTG AATAATGTTC AAACATGATT TTTTTCTGCT TTGTTTAGTA GATACTAGTT 2160

TTCCTTTGTA AGCATTTTTG CTACAAATAT AATCACAAGA ACAATTCCCC AGAATTGCAT 2220

TGTAAATAAA TTGAAGAAAC TTTCTGAAAA GCTGCTTCTT GGCATAAAGA ATAGATTATT 2280

CAAGATGAGT AGGGATAAAG CAAATAGGAT TGTCCTTGAG CGATAGGCTA CTTGCAGCAT 2340

GGCTATAAAT AATACGCCGA GTAAGAAACT AAGCAGAAAG ACTCCAATCA TACCATAGTC 2400

GGTATACAAC TCCATGATAT AACTACTTCC GATACCATGC CCTTTCAAGT ATTCCTTGTT 2460

CAAGACAAGA TAGGATAGAT TGTGGGCATA ACTATTACTA TCAATAGCTA GTTCGACACT 2520

ATTGGTTGTA TGTTCAAAGG CTTTTCCTCC GAAAATGGCT CCCAAACTCC CCCTTGCAAA 2580


ATAATCAAGA ACAGGACCAA AAGTAAAATT ACGGAAATCT CGGTAAGGGA GGCTACTGTT 2640

AAATAGAAAA CCTCGAGCCA GAACACCAAA ACTAGTCCCT TGTTTATAGA TAAAGTCAAG 2700

TAAGATATCC CAGAAACCTG TATGGGAAAC TTGGACATTA TCCCGTACAT AATTGAGTAC 2760

TCCCATCGCT AACATGAGAA TAGGAGAACC TACAAAAATC GCTAACTTTT CTTTAAACCC 2820

AATCCATTTT CCTTTTTCAG TTTGCTCCCG CATAAAGTAA TAAACAAAAG CAAATmAAAT 2880

ACTTAAAATA AAGGGATTTC GTGTCCCAAT TGCCAAATGA ATAGTATTAG CTGCAATAAA 2940

GGAGACAAGC ACTGCTGTGG CCTGCAATTT CTTTGGCTTG GTTGCCAGAT ACATACACAT 3000


TGCATAGACC GTAAAGGTAG ACAAAATGTA GGTAAAATAA 


GGCAGTTTAC 


TTTPAAAATT 

A A X 1 A 


3060 


TGCATAGTAG GCATAGTAGG AAGTCTGCAA ACGATACAAG AGCCGTTCAA ATAACCGAAT 3120

GAAATAGAAA GGATAAGTTA GAAGAAAAAC TCCTAGTGAT ACAAAGCGTA ACCGCTTGAT 3180

ATAAACCTCT TTTAGAGAAT TTCCTATATT TGCTACTTTT ATTTTCTTCC TAGCTATGAA 3240

GTAACGAGCC AGAATGCCTC CTGTGGTCAA GCCCAGAATC GAAATCATGA CAACTATAAA 3300

GGCAAAACGA TAGGCTATTG GATGATAGGT ATCCAAAGCA CCATCCCTAA AATAATfAAT 3360

GGTCGGTCTT GATACCAGAA ATACAAAAAT GGTTAAATAG AAAATAAAAT GGATTAAGTA 3420

ATACTTGATA TCATTCCAAC AAGCAATTAA GCTACTAACC AACAAGAACA ATAAAGTAGA 3480

AAGTAAGCTA ACATTATTAT TATTAAACAG ATACACAATT CCACTTACTA GCGTCAAGGC 3540

ATAACTGACT ATGGTCAAAC TAAATAATAA TCGTTTCCCA TCAATCACTT GGTCACCCCC 3600


GTTCTAATGT AATTTTTTAG ATTTTTCAAT ATTTTTCAGT 


AATAAGAATC 


\sA r AT AAGG A 


3660 


AATATTTATG AATAGGGCCA AAGCACTAAT TCTTCTCCCC TTACGGAAAA TTGGATTCCT 3720

AGAAATAGCA AAGGCATGGC CTTTTAAAAA ACGATGAATC TGAGAATAGG CTTCAAACTG 3780

TTTATACTGA TCATCTAGCA ACATCTTATC CAGAATAAAG AAGTGGGCAT AGGCCAATCT 3840

GAAAAAAGCG ACCTCTTTCA AGTCAGGATA GTTTTTCACA ACTTCATTAT AAAACTTTTG 3900

GTAGATATCA ATATAGGCTA AATCCTTCTC TGCATAGGGT TTGGTCGTAA TACTATCCCC 3960

TCTATGGAAA TAGTAATAAT AGGGTTTAGT ATTAACCACA TACTTCTTGG CCAACTTGAT 4020

TAAATCAAAA TGGTAATAGG CATCTTCGTA AATCAACCCC TTAGGAAAGG ATAGGGCAGT 4080

TGCAATCTGT CTCTTGATTA GCTTATTGCA AATCGTCCCA GGTATTTTTT CACCTATGAG 4140

GTATTCCTTT AGAAATGTTT GAGAATCACA GACAAAATAG TCATCCTGAT TGGCTGACTG 4200

TGGGCTTTCA TCATTAGCAT AGACATTCAT GACACCACAG CTCGAAACAT CCGCATCTTC 4260

TTGAACTAAT TGCTCATATA AGCTCTGAAT CATTTCTGGA TGGATATAAT CATCTGAGTC 4320

AATAAAAATC AGATAATCCC CGTGAGCCTG CTTCATCCCA TCATTTCGTG CTTGCGACAA 4380


TCCTTCGTTC TTTTTATGAA GCACTGACAC CCTGTCATCT TGTTCAGCGA TTGAATCACA 4440

CAAGCGACCA CTTTCATCTG TTGCACCATC ATCAACAAGA ATAATTTCCA GATTTTGATA 4500

GGTCTGCTTC TGAATGGAAG CTATCGATTT TTCTAGGTAC TGCGCCACAT TATAGACTGG 4560

CACAATCACA CTAATTAATG CAGTTTCCAT GCTACTCCTC TAATAGTTTT TCTACTTGTT 4620

CGATTTGTTT TGTAATTGTA AATTGTTGAA TGAATTGGCT AGCCTCATCG ACATCAAAGT 4680

TTGAGGCAGA AGTCATGTAA TTAGTAATCG CCTGAGCTGC CTCTTGATTG CTCTCAATGA 4740

TTTGTCCAAA TCGTCCTTCT TGGGATAATT CCTCAGCCCC TCCAACGTCC GTAGAGATAA 4800

AAGGGAGTCC CAGACTCAAG GCCTCCACAT ACACTCCAGG AAAACCTTCT TGTTTAGACA 4860

TAGACAAAAG AACTTTCGTC TGAGATAGAT ACTGATAAGG ATTTTTTTGA TAACCAAGGA 4920

AATGTACATA GTCCTCAATC CCATACTCTT TGACTCGTTT TTTCAGTTCC TCTTCCATAT 4980

CACCAGCCCC GATAAAATAG AGATGATAGT TTTTTCCCTC TTGGTGTAAT AATCGTATCA 5040

CTTCCACTAC ACGGTCAGAA CCCTTATTTT CCTCAATCCG TCCGATAGTA CAGATACTTT 5100

GAGGAGCAAT CTCGATATCG ATCTTCTCTT GAGATTTTTC TAGAATAGTC TGAAAATCAT 5160

ATCCATTGTA GATTGTCTGT AATTTAGAAG TATAATCTGG ATAAACTTCC TTGATAGAAT 5220

TGCTGGTCTT TTTTGAAATC CCTACAATTG TATTCGCAGC ATCCAACTGG CTTCTATGTG 5280

ATTCTCTTTT AGAGCTATCC TTAAGAAGTT CTTCAATACT TCCATGAATC CAAGATATCT 5340


TCTTGACTTC TCTTCTTTTA GAGAACAACA GTGGTGGATT CATAATGGTA AAAGAAACTT 5400

CAACATCATA ATCATCTTTT ACAAGCAAAC GACGAGTCAG TCTTGGAAAA TAAATTCTCA 5460

TTCTCCACAA AAAAGCTCGT AACCATCTGG TTTGGCGATA ATCTTGAAGG GATTTTAAAA 5520

TGCGTACATG CTTTGGAACA GATTCATATC CCTTGTCAAA GTGCTCCATT TCAAGAATAT 5580

CAATATCATA CTTTTCTGGA TCCAGATTTG AAACAATGGT TGATAGAATC TTCTCTGCAC 5640

CACCTCCAAG AGAAAAAGAC CACATAAAAA ATAAGATTTT TTTCTTAGCC ACCATATTCT 5700

CCCTTGTATT CTGTATAAGA CTTATCCATA TCAGCGATGA CAGCATCATG ATGCGGTACC 5760

TGCTTGTCTG CTGGTGGAGG CGTCATATAA TCCCCAAAAG CAGTTCTGAG ATAGACATCA 5820

TAGCCGATTG GAATAGGCAT CTCTGTTCCT TCAAATGGCA AGAAAAGATT GTCTTCAAAA 5880

GATGTGATTG GGTACTTGTT TCTCATGTAG CCAGGACCTG AGCATAATTC TGTAATGCCA 5940

TCACAATCAG CCAAATCATA CTTAGTCATT TCTTTCTCAG CTTTTTTCCA GATGCGATAA 6000

CGGAGAGATT TTGGAGTCAA ACCCAGTAAA ATGCGACTTC CCCATTTCAT GAGATCACCA 6060

TGCTTTTCTG GAATAGTTTG CGCACAAAAG AGTGAATAAA TCAAGGCCCA ACGAACCTGT 6120

TTTTTCCGCT CAGCTGGATT TTTCGGATAA TAATCCAAAG GCAAAACATC CAAGGCCAGA 6180 

CCATGTGGCA AATCCAAATC CTGCTGATAA GGCTTGATAC AGGTGGTTTT CTTGTCACGA 6240 

ATGGTAATAA AAAGATTACG ATCAACAAAA TCCTTGTGAC TCTTTGACAA GAAATAACGT 6300 

TCATCTGCAT AACGAGGCCA TAATTCTGCT AATTTCTCAT AATCTTTACG AGGCATAAAA 6360 

AAGTCTAGGT CGTCGTCCCA AGGAATAAAT CCCTTGTTTC GAAGGGCACC AATAGCGCCT 6420 

CCGCCACAGA GATAACAGAG CAAATCATGT TCTTTACAAA AGGCCACAAA ATATTCAGCC 6480 

ATCTCCAGAC TACGAGCCTG AATTGCTTTT AAATCAGTCA TATTGTTCAT TATTCTTTCT 6540 

ATCGTATCGT TTCATTATAC CACAAACAAG GGGTGAAAAT CTATTGCAGA CTGTAAAAAA 6600 

TCAAAGCCTG ACTGCTATCC AAATAGCTAT CAAACTTTGA TTTTTCTGTC TTATACTCTT 6660 

CGAAAATCTC TTCAAACCAC GTCAGCTTCA CCTTGCCGTA GGTATAGGTA ACTGACTTCG 6720 

TCAGTCTTAT CTACAACCTC AAAACTGTGT TTTTAGCAGC CTGCGGCTAG CTTCCTAGTT 6780 

TGCACTTTGA TTTTCATTGA GTATTATCTT ATCTTAAGCC CATTTGAGCG AGCTTGGTTT 6840 

GATATTTGTT TTGATCAACC AGCAGGCCCA AGCCCCCATA AACATCATAG GCATCTACCC 6900 

AGTCACCCAG TTCTGGAATC GTCAATTTTT CAATACCATT TTTTGCTCCA TCCAAAACAG 6960 

ATAAACCGTT TGTTAGGAGG AAAGTATAGG GTACGTTGGT TGAGGTCATA GCAAAAACCT 7020 

TTCCAAGAGC TTCAGAACCA GTGAAAAGTT TAGTGGGATC TTTAATTTGC TCTAAAATTG 7080 

CTGTTAAAAC TTGTTGCTGT CTTTTTGTAC GGCCGTAATC TGCCTCATCA TCATCACGGA 7140 

AACGAGCATA ATTGAGCAGG GTCGAGCCAT TCATCTGCTG TTTTCCGACT TTAATGGTTT 7200 

GGGTTGGAGA CTCAGTCTCG GTAGCGTATA AATCATCTCC GACTGTAGCT TCTGTTAGGG 7260 

GAOGCCCATT CAATGTTGAA AATTGAGCAT CAATCGTCAC CCCATCAGGG AAAAGCGTGT 7320 

CAATCGCTGT GGCAAAGGCC TGGAAATCAA CCAAGGCGTA GTACTTAATG TCCAAGTCAA 7380 

AATTATCTTT CAAGACTTGG CGAACCATTT CTGCCCCTTT TTGCCCCTCT TGTTCTCCTA 7440 

ACTCGTAGGC TACGTTTAAC TTGTTATCTG TCTGTTTTCT ACCATTAATC ACTTGACTAT 7500 

AACCATCTAT ATAGACCAAA TTATCACGCA TGAAACTGAC TAGCTTCATT TTCTTATCTG 7560 

AGCCCCCGAC ATTTAATACC ATAATAGAGT CAGTTCGTGT CTCAACACTG TTCTGGCCGA 7620 

TTCGACCATC AGTACCCATG ATTAAAATAT TAACTCCATC TCTAGTGTCC TGACCATTAA 7680 

AGACTTCTAC TTGAGCTGCC CGGGCATCAG CAGTTTTCTT TGCGCTAGCA TCTTGGTAAC 7740 

CACGCAAAAA CATGAATACC ATGGCCAAAG CCACACAGAC CAAAAGTGAA AAAATCACCA 7800 

TAAAAATTCG TTTAAGACGG AGCTTCCGTC yn TO Tm 1 TGGAGGGAAA GAGAGTGCTT 7860

GTGATTTGGA TTGTGAGCGA CTCCGGTTCG CATAGCTTGG TAAGTCAACC TGCTCTTCTC 7920

TTTCTTGTTC CAAGCTAGAG CTACTATTTC CCCTAGCAAG AGTTAGCTTT TCTTGCAAAT 7980

AGGCAAACTC ATTTTTTTCT CTCTCATTGA GATAGTGAAT ATTTTTTAGC AAATAATCAT 8040

AACGCAACTG CTCATGATGA GTTAAGGGAT TTTCTTTACT CATCTTCTCT CCTTTCCATG 8100

GTCTGATATT GGATAAATAG GATAGGCACC CAGAATTTTA TACTGGATTC CAATCGCTTC 8160

TAATTCTTTT TGGGCAAAGT GGACCAAGTC CTTATCGGTA TAATCCACAT CGATAATGAA 8220

AAAGTATTCA CCCAGTGCTG TCTTGAGTGG ACAACTTTCA ATTTTTGTCA AGTCAATTCC 8280

TCGCCAAGCA AAGGTCGACA GGGCCTTATA AAGTGCACCT GGAAGGTTGT CAGGTAATGT 8340

CAAGGCCAAA CTCATCTTTT CAGTTTGTGC TTGCAAGGGA ATACTAGGCT TTTCAGCTCC 8400

TAGAACCCAG AAACGTGTGA AATTGGCTTC CATTTCCTGA ATATCCTCGG CAATCAGTTC 8460

CAATCCATAT TCTTCAGCAG AACTTCTAGG TGCAACTGCT GCAAAGGGCT GGTCTGGATG 8520

TTCGGAAATA AAACGGGCCG CATAAGCTGT ACTAGCTGTT ACCTCGATTT GAGCCTCTGG 8580

ATATTGTTCA TCGATGAATT TCTTTCCTTG AGCCAAGGCC TGTGGATGTG AAAAAATCTT 8640


TTCAATCTTA GTATGGCCTG GAACCACCAT CAACTGCTGA TGAATAGGCT GAACGATTTC 8700

TGCTACTGCT TGGATGTGAG CCTGATGAAA AAGATAGTCC AAGGTTTCAT GAACACTACC 8760

CTCAATAGAA TTTTCAACTG GCACCACAGA ATAGTTCACT AATCCTTGCT CATAAGCCTT 8820

GATGACATCT GTAATGTTGG CAAAAGCCTG CAATTCCTCA TGAGGAAAAG CTGTCTGCAC 8880

AACGTGGTGT GAAAATGATC CCTTGGGACC TAGATAAGCA ATTTTCATCT TAGTTCCTCT 8940

ATAATTTCCT CTGGGCTTAG CTTGGTCACA TCCAAAACCC GACTAGCCAC TTCCTCATAC 9000

CAAGCCTGTC TTTCTTGGAA AATAGCTACT AGTTCTTCCT TGCTATTATT TAGAAAAAGC 9060

GGTCGCTGAT TGTCCTTATC AGCTGCGATA CGTTGGTAGA GGGTTTCAAA ATCTGCTCTC 9120

AGGTAGATGT TATCTGTATT AGTCTTGAGT AAGTCACGAT TTCTCTGAGA AATAACCACT 9180

CCTGCTCCAG TTGACACGAC TTGGTCTGTT TGTAGTAAAT CAGCTAGGAC TTCTGATTCT 9240

ACCTGACGAA AGGCTGTTTC TCCCTTTTCA GCGAAAAAAT TCGCAATGGA CATACCTAGG 9300

CGATTCTCAA TCAGAGCATC CATATCAAGG TAATTAGGGT CCAAGCCTCT TGCAATAGTC 9360

GATTTTCCAG CCCCCATAAA CCCTAATAAC ACCTTAGCCA TGAATCAAGC TCTCCAAATC 9420

ATCAAAGAAA CTAGGATAGC TGGTATTGAT GGCTTCTGCA CGGTCAAGCT CCACCTCTCC 9480

ATCTGCAACC AAGAGGGCTG CGATAGCTGT CATCATGCCG ATACGGTGGT CACCAAACGT 9540

ATTGACTCTA GCACCGTGAA GAGCTGATTT TCCTTTGATA ATCATCCCAT CTGCCGTAGG 9600

AGTAATATCT GCTCCCATAC TATTTAAGGC GTCTGCCACA ACCTGAATAC GGTCTGTTTC 9660


CTTGACCTTG AGCTCCTCAG CATCCTTGAT AACTGTTACA CCTTGGGCTT GGGTCGCAAG 9720

CAGGGCAATA ATGGGCAATT CATCAATCAA TCGTGGAATC AAAGCGCCAC CAATCTCTGT 9780

TCCTTTCAAG TCAGAAGACT CAACAATCAA GGTAGCAGAT TTAGCGACTG GATCGATTTC 9840

AGTTATTTCC AATTTTCCAC CCATGGCACG AATGACATCA ATAATACCGG TGCGAGTTTC 9900

GTTGATCCCC ACATTCTGCA GCACTAGACG AGAATTTGGA GCAATCAAAC CTGCGACTAA 9960

CCAAAAGGCT GCACTGGAAA TATCTCCTGG TACGACCACC TTCTGTCCTG TCAATTTTTG 10020

TGGCCCCTGG ACTGTGATTT TCTTACCATC CACACTTAAA TGACCACCAA ATTGTTTCAA 10080

CATATCTTCA GTATGATTAC GGGTGTACTC TTTTTCGATA ATAACTGACT CCCCCTTAGC 10140

TTGTAAGGCT GCAAACATCA AGGCTGACTT GACTTGGGCA GAGGCAATTG GCAACTCATA 10200

ATGAATAGGT CTTAGGTTTT TCGTCCCTTT TAAGCGAAGG GGAGGCAAGT CTCGTTCAGT 10260

TTGCCCTGAA ATGCTGACGC CCATTTTTTT CAGTGGAAGG GTCACACGGT CCATAGGACG 10320

TTTGGAAAGA CTATCATCTC CAAACATCTC TACTTCGAAA TCTGCACCAG CAAGGACACC 10380

TGAAATCAGG CGAATCGAGG TGCCAGAATT TCCCATATTA AGGGCATTTT GTGGCGCTTT 10440

TAAGCCAGCC ATGCCTACAC CTTGAATGGT AATAACCCCA TCTTTATCCT CAATTTCAAC 10500

ACCAAGGTCA CGAAAAACCT GCATGGTCGA AAGAACGTCT TCACCTCGCA GAATATCATA 10560


AACCTTGGTC 


TCACCCTCAG CCAAACTTCC AAAGATAATG 


GAACGGTGGC 


TGATAGACTT 


10620 


GTCACCTGGG 


ACGCGGATAC TACCATGTAA ATGGCGAATG 


TTTGTTTTTA 


GTTTCATACT 


10680 


GGACCTCATA 


CTTGCAATAC TTTTACCTAT TTTATCATAA 


AAAGCCAGAA 


ATTCCTTAAA 


10740 


AATTCCTGAC 


TTTAGGATCG TTCTTTTCTT ATTTCAGCAA 


TTCTGAAACT 


GGTTCAAAAA 


10800 


CAATTTTTTC AATATCAGAA AGGTAAATGG CCAATTGTTG 


TTGCTTGGTA 


AAGAATTCTG 


10860 


ACAAGAGGCT 


ATTTCCTTGA ATCTGTTTAC CAAAGCCTTC 


CATCTTAGCT 


TGGAAGGACG 


10920 


CATCTGGCAT 


TTGACCTGTC TGTGCTAGTT TTTGAATTTC 


CTCTTGAAAG 


GCAAGATAAT 


10980 


CTGTAAAGAT 


TTTGCTTGCC TCAGCATCTG CTGCAATCGC 


ATCTTTAGCT GCTTTAACAG 


11040 


CCTTGTATTC TGGTAATCCG CGTAGACCGC GACTGAGTTC GTTTGCACTA TCGTAAATAT 11100

TTGACATGTT CTTCTCCTTA TTTGATGACG ACTGTATAGT CAGTATTTTC TGTTATGAGA 11160

TGCTCAGCTC TTTCCAAGTC TTGAGCATTT TTAAATGAAA TTTGTAGGAT TCCGTGAATA 11220

TCCTCACGAT TTTCCTCGTT GATGTGGATA TTAACCAAGG AAGTTCCACG TAGCAGTTCC 11280

AAAATCCGCA GGATGACATC TTCTTCATCA GGAACGTCAA CATAGAGGTC GTAAGAGCTA 11340

TCCACACCAC CACGCTTATG GATTTCCATG GTCTGGCGTT GTTCACGCGC TTGGTTAAAA 11400

AAGTTCCAAA TTTGCTCTTC ATCTCCCTTA CTAATGGCCT GACCAATCGC TTCCAAACGT 11460

TCCTTGAAAT CCTCAATTCT ATCCAGAATG ATCTCGCTAT TGGACAAGAG AATGGAGGTC 11520

CACATTCCTG GCTCGCTTTC CGCAATTCGG GTCATATCTC GAAAACCACC TGCCGCAAAG 11580

CGCCTTGCCA TCTCATGCTC TTGAGCATAG ACCGCAGTCT GCTCCATGAG ACTAGAAGCC 11640

AAAATATGAG GAAAATGGCT AATCTGAGAA GTGACACGAT CATGCTCCTT GGCATCAATC 11700

TCGATAAAAC GAGCATGAAG ACCTGAAAGC AGATCCTTCA TTTCCTTAAG CGTGTCCTGA 11760

CTTGTCAGGC TTGAAGGTGT AAAGATATAA TAGGCATTTT CAAAAAGATT GACATCTGCC 11820

GAAGCAGCCC CTGTCTTGTG ACTACCAGCC ATGGGATGGG CCCCGACAAA GCGAACAGAC 11880


TTGGCAGCCA AATACTGCTC CGCCGCATCC ACAATGGTTG ACTTGGTCGA ACCAGGATCT 11940

GAAATAATAA CGCCTTCTCG CAAATCCAAA TTGGCCAACT CCTTAATGAA AGCAATAGTT 12000

TGTTTGATTG GCAAGCTGAG GATAATGACA TCTGCCAAAG GAGCAAAACT AGCAAAATCA 12060

TCCGTTGCAC GGTCAATCAT ACCTTCTTTC AAGGCGATAT CTCTCGAAGC TTGACTACGA 12120

TTATAACCTA AAATTTCATA ATCTGGATGA TCGCGTTTGA TACCAAGTGC CATAGAGGCT 12180

CCAATCAACC CAAGACCTGC GATATAGATT GTTTTTGCCA TAGGAACTCC TTAATAGTTC 12240

TTTGTATAGT CTCGGTGTTT GGCTACCGCT TCTTTTAGTT CCTCAAGATT ATCTGATGAG 12300

AATTTTTCGA GGATTTCTTG CGCCAGAACC GTTGCTACAA CTGCTTCCAT GACCATTCCT 12360

GCAGCTGGAA GAGCAGTCGG ATCACTTCTC TCCACGGTTG CCTTGTAAGG TTCGTGGGTT 12420

TCGATATCCA CACTCATAAG AGGTTTATAA AGAGTAGGAA TGGGTTTCAT GACCCCAGGA 12480

ACAACGATGG GTTGCCCATT AGTCATACCA CCTTCAAAAC CACCTAGATT ATTGGTACGG 12540

CGAGTATAAC CGTCTTCTTT AGACCAGAGA ATTTCATCCA TAACTTGGCT GGCTTTAGGA 12600


TAACCAGCCT CAAAGCCAAG ACCAAATTCC ACCCCTTTAA AGGCATTGAT AGAGACAACA 12660

GCTTGAGCCA ATCTTGCATC CAATTTTCTA TCCCATTGGA CATAGGAACC AAGACCAACT 12720

GGAACGCCTC CGACGACTGT CTCCACAAGC CCACCGATGG TATCACCATC ACGTTTGATT 12780

TGGTCAATAT AGTCCTTGAT TTCCTGTTCT CGTTCTTGGT TGACAATAGA AACTTCAGAC 12840

TGGGCAGCTC TTTGCTTAAT TTCAGCGACT GTCAGATTTT CAGGAACATC GATTTCCTTG 12900

CCACCAAAGA CCACGACATG GTTGGCAATC TCCATATCCA GCTCAGCCAA GAGGCGTTTG 12960

GCTACTGCAC CAACTGCCAC CCGCATGGTG GTTTCACGAG CTGATGAACG CTCGAAAGAA 13020

TTTCGCAAAT CATCAAAACG GTACTTAATC CGCCCAACCA AATCGGCATG ACCTGGGCGA 13080

GGATGAGTAA TTTTCCGCTT GCTTTTAAGG CGGTCTTCAA TGTCCTCCGC AGACATGATG 13140

TCCAGCCATT TCTGGTGGTC CTTATTGATG ACATCCATAG TAATAGGCGC CCCTGTCGTC 13200


TTCCCGTGGC GAACGCCCGA AGTAAAGACA ACCTGGTCAT TCTCAATCTT CATACGACCA 13260

CCACGACCGT AGCCACCCTG ACGGCGTCTA AGGTCCTCAT TGATATCCTC AGCTGTCAAT 13320

GGAAGTCCAG CTGGAATTCC CTCAATAATA GCTGTTAGAC GGGGGCCGTG TGATTCTCCT 13380

GCAGTTAAAT ATCTCATACA CTCTCCTTAT TTTACCAAGT AGTCTTTCAT CTCTTCCAGA 13440

GAAACTGGGT GAATGGTCGC TGAACCAAGC TCTGGCACCA AGACCAATTT CAAGGTGTTA 13500


CCACGCGCTT TCTTGTCATG AGTAAGAGCC TGATAAAGCT TGCCAACTTC CCAATTTTCA 13560

TAGTCAACAG GCAAACCGAA TTTCTGACAC ATCTCTGTGA TAGATTGGGT AATGCCAGCT 13620

GGCATGAGGC CTTTTTCCTC AGCAACCTTG GAAATCTGTA CCATTCCCAT GGCAACAGCC 13680

TCTCCATGCA TGACCTTGCC ATAACCGGCA GTCGCTTCGA TGGCATGGCC AATAGTGTGG 13740

CCAAAATTGA GGTAAAGACG AATACCATTG TCCAACTCAT CTTCAACCAC 13800

TTCACCTGAC AAGAATGTTC AATCAAGGTC TCTGCATGTT CCAAAATACT 13860

CCATTCAGTC CCGTCAAGAG AGCCCACAGT TCTGGATCCT CAATCAAGCC ATACTTGATA 13920

ACTTCACCCA TCCCTTCAAT CAACTCTCTT TTTCCGAGGG TTTCAAGAAC AAGTGGATCA 13980

ATCAGAACCC CATCTGGTTG GGCAAAGGTC CCCACCATAT TTTTAGCAAA TGGTGTATTA 14040

ACGCCTGTCT TTCCACCGAT AGAAGAATCA ACCTGAGCTG TCAAACTAGT CGGAATCTGA 14100

ACAAAGTGAA TACCCCGCAT ATAGGTAGAG GCTACAAATC CAGCCAGGTC CCCAACAACG 14160


CCACCACCAA GAGCAACGAT TCCATCGCTA CGAGTCAGAC CTTGCTTGAC TAGAAATTCA 14220

TAGACTTTCT GAACAGTAGT TAAATTCTTT CTTTCTTCAC CTTCTAAGAA ATCAAAAACA 14280

GCTACCTGAA AACCAGCATC TTCTAGGCTG AGCTTGACCT TCTCTGCATA GAGAGAGGCT 14340

ACATGGTTAT CTGTCACAAT GACTACCTTT TGCGGTTGCC AGAGTTCTCG CAACCACTGA 14400

CCAGCCTGGG CCATACAACC TTTTTCAATC TGAATATCAT AAGGATGGTG AGGAATATGG 14460

ATTCTGATTT TCATAGGAGA GTCTCCCTTT CTTTATTGGT ATTTTTCTGT TAAAGACTGC 14520

CAAATCTCTT CTGTCGGCAT TTCCTTGCCT GTCCACAGTT GAAAAGCTTC TGCAGCTTGA 14580

TAGAGTAACA TTCCCAGACC ATTGACTGCT GGATTGCCCT GACTTCTAGC CCATTTCAAA 14640

AACGGTGTTT CAAAGGGTTG GTATATGATA TCTGCAACTA AAAGAGTTTC TGGTAAGACT 14700

ATGTTTTCAG GAACAGGAGA GGATTGGCCA TCCATGCCCA CACTGGTGGC ATTAACTAGC 14760

AAATCCGACT CGGCAATCCT TGCTTGCAGT TCAGAAACAT ATTCTAAAGC ACACAAATCC 14820

ACTTTAAAAC CTGTCTGCTC CTGTAACTTG TCTAGGTAAG GTCTTGTTTT TTCCATAGAA 14880


ACGGAACGAA CAAAGACCGA AATCTGACTG ACGCCATCCA AAATAGCCTG TGCCAAGATT 14940

GATTTAGCCG CACCACCTGC ACCCAGCAGG GTCATCTTTT TACCTGAAAT TGTAAAAGAA 15000

GGCAAGCACT TAAAAAATCC CTTGCCATCT GTATTATATC CAATTAAATT GCCATTCTCA 15060

TTGACAACCG TATTAACCGC ACCAATCAAG CGCGCTTCAT CGCTCAGCTT ATCCAAATAA 15120

GGAATCACCT GCTCCTTATA GGGCATGGAC AGATTGATGC CAAACATCTG GTAGCGACGA 15180

ATATTGGCCA CTGTTTCTAC CAAGTCACTC GCTTCAATCT CCCAAGCCAC ATAAGCACCG 15240

TTGGTAGCTG TCGCCTCAAA GGCTCTATTG TGGATGAAGG GAGAAATAGA ATGCTTAATA 15300

GGATTGGCAA CAACTGCAGC TAAACGTGTA TAGCCATCAA GCTTCATCCA AAATCTCCCT 15360


GATTTTTTTC ATGCTAGCTA GAGAAATCTG CCCAGGGGCA CTAACCTCAT CCAGACTGGC 15420

AAAAGACCAA CTCGAACCAG TCACATCCGC AGTGATACGA GAGACCTTGC CCACCTTACG 15480

CATAGAAATG GTCACATATT CCTGTTCAGG ATTGAGGGTT TTAAAGCCTC GTGTATAGTT 15540

CATCAAGTCT AAGACATCCT GCTCCGTGTG AGCCATCACC GCAACCTTAA CAAGTTTTGG 15600

ATTTAGGATC GTCAACTCTG ACAAGATTTC CATCATGTTG TCAGGTGTTT CTTGGAAATT 15660

ATGGTAACTC AAAACAAGAT TTGGGAAGTC CAGCATTTCC TCAAAAACAT CCTTGTAGCT 15720

ATAGTACTCA AAATCAATAT AGTCTGGTTG ATAGAGTTGC GCAACTTCCT TGATTAGATG 15780

GATATACTCT TCTGGAGAAA GGTCGATTTC TCCACCTTCG GAGCGAGTTC GTAGCGTGAA 15840

AACCAACTCA CGGCCTGCGA ATTTTTCAAA AATGGCTGGA GCTACCTGCA AAATCGCTTC 15900

TTTAGGCAGA TAGTCGGCAC GCCATTCAAT GATGTCGGCA TCCAGGTACC TCGTGGCATC 15960


CAGAGCCTGA GCCTCCTCTA AACTTCTTGG CATTACTGAA ACGATTAATT TCATTTACTA 16020

ACCTTCATAC TAATCACCTT GAGGTAATTA CTACTTTCAT CTTTTTTATT ATAGGCAAAA 16080

TCTGCTGGAA GACCATATTT GTTTAAAATC TGGTAACTTC TTCCTGCAAA ACCTTTATCA 16140

ATTTGTTCTG TAAATTTCTG ACGGGAAACA TTGGCAGCAT TGGTACTGGC AATGATAATC 16200

CCTCCCGGAT TTAAAATCTC AAGACTCTGG GAAATCAACT TGTGATAATC CTTGGCCACA 16260

GAGAAAGTTT GTTTTTTATT CCGAGCAAAG CTAGGCGGAT CTAGGACAAT CACATCGTAG 16320

GTCAAGTCTT TGCGTTTGGC ATATTTGAAA TACTCAAAGA CATCCATGAC TATAAAACGA 16380

TGCTCGTCTG TGCTGAGCCC ATTTGCCTGA AAATGCGCTT GAGACAATTC TCGTGAACGT 16440

TTGGCTAGAT CAACAGAAGT TGTATGGCTA GCTCCTCCCA TGGCCGCAGC TACTGAAAAA 16500


GCCGCTGTGT AGGAAAACAT ATTGAGTAAG GATTTACCCA TAGCCAAGCC GTCAACTAAA 16560

CTACCGCGAA CCTCATGCTG GTCTAGGAAA ATTCCTGTCA TCAAGCCATC ATTCATAAAG 16620

ACTTGATACA GGACACCATT TTCTAAAACA TTGAAAAAGT CAGGTGCTTC TTGACCATAA 16680

ACATGGGCAG ATTCATAGTC CAAACCCTTA AAGCGGATTT TCTCATAAGC TCCTAAAACC 16740

TCAGGGAAAA CCTGTCTAAA GGCTTCTGAT ATAGTCTGAC GAATCTGATA AACATAAGAG 16800 

TTATACCAAG AAAAGACGGC GTAGTCGCCA TAAAGGTCCA CTGTCAGACC CCCAAAGCCA 16860 

TCTCCCTCTT GATTAAAGAG ACGAAAGGCA GTTGTCAAAT CATCTTGATA GTAGGCGTTT 16920 

CTCTTTTCTT TGGCTTTTCT AAACAACGTT TCAAAGAAAG CTTGATTGAA GGCCACCTTG 16980 

TCTTTGCTGA TAAACCAGCC CAAGCCCTTG TTTTGCTGAG AAAGGTAGGC AGTCCCAAGA 17040 

AAGTTTCCTT CCTGACCCTG CACCTCTACT TCCTGATCCT TAAGATTGAC ATTCTCAAGA 17100 

TCACTGGCTT CTAGTAAAAC TAGCCCCTTA GCAAGCTTCT TTTCAACCCT TTTGCTGACT 17160 

CTTATTCTAT TCATAACTAC CATTATATCA AACTTTTAGA CAATTCTCAA AAAAGAAACT 17220 

ACCCTTGCTT TTTTACTCTT CTTTTAAAAA ATGGTATACT AGACTTCCTG CAAAACTAGG 17280 

AAGTAAATGT GTAAGAATCA CAGTAAAAAA TGCTCTTCCG TCTTGGAGGA GCATTTCTTT 17340 

TTATCAACGA AAATCAAATA GCAAACTATG AAACTAGCCT CAGGTTAACT GTGAGATTAT 17400 

AGGTAGAGAG GTTGTATCAG CAATATGTGT CTGTCAAATT TAGTGACAAA GGTAGTAGAA 17460 

GAAAGATAAA GAAATAAATC AGCTTCAGTA GGTATCTGGA AAATTTGATT TTATAGAGAA 17520 

GCCTTTTGTT ACAAACTCAA TATACTATCA ATAAATAATA TTATAGAAGC AACAATAATT 17580 

ATAATTTCAC CTATCTGCAT CATTCTATTT CGAACTCTAA ATATATGTTC TATCAAAAAT 17640 

ACTTGGAACA CACACATTAT AGGAATTAAC GTTTTTGAAA TTGAAAAATA TCCAAATAAA 17700 

TAAACTATAA ACAACAAAAA TAGAACTATG TTATATTTCT TATTCAAAAC ATTCCTCCCT 17760 

ATATATTTTT GATTACCAAT CTTAATCATT TACAACTACA TTCTAACAAA CTATAAAAGC 17820 

GTTTGTCGAA TTGAATTTAT CAAGCAAGCG ACCAACCAGT TCATCTTTTT TCTATTTCTG 17880 

CCAATATGCG TGACAGGTAA TAATGATAGC CAAAAATAGC AAGAGCAAGC AAGACGATAA 17940 

GAGCTCCTAC TCCCAAGCTG ATGGCAAGGA TAGGGGAGAG AGACTGAACC AAGAATATGC 18000 

TCCCAATTAC AAGGGCCATC AGGATTGCAC TATAAATAAA CAATAAAACT ATGGCGACTA 18060 

TGCCATTTGA ACGATTCACC AGGTCCGTAA TGCTACTCCA ATTGGTTGAC AGATTTTTAA 18120 

CGTCCTTAAA GTAATGGTGG CAAGAAAGGA TGACACTGGC AATGATCCAG ACTACAAGAA 18180 

GGTAAATCAT CGAAATGATG GGCAAGCCTA GATATAGAGA AAGACCAAGC AAAGTCAGAA 18240 

CTGGTAAAAA GGACTGGACA GCATATATAA TCCAAAATTT CACTTTCACA TAACGAGCAA 18300 

AGTCAAAGGG TAAACTCTTA AGAAAATCAA CATTTTCCCT CTCCAAGGAC AAGGCAATTG 18360 

AATGCAGGCT GGTGATATTG TTATTGACAA CTGCTATAAA GAGAGCTATA AAAAACAAGG 18420 

GTAACCAGTA TGGAGGATGA ATGTCTGGAA CTATCTGAGA ATCTCGGATT TTGGAAATCA 18480 
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GACCGATCAT CATGAGATAA GGAAGGAAAG CACTTGTAAA AAGCACTGTA ATCACGCCAG 18540 

TCCCCTGTCC CAAGAGGGTG AGGTGGTAGC GTAAAACCAT GCGAAAAAAT CCCTTTTTAG 18600 

TGGTTGAAAT TCTCTCCTTG CTGCGACGTT CTTTTTTGAC CTTCTCCTCA CTATTAAGCA 18660 

GGATCACGTC ATAAAAACGA GGAAGGACCT TCTTTTTGGT CAGATAAAGC AGGAAGAGAG 18720 

TTAGTCCTAT CCAAGCGAGC AGACCCACTA AGGCTTCTGT CGAAAAAGGC TCCACTGCTA 18780 

TTTTGTAAAA GATATGAAGA GGATAAAGGA GAAATGGAAT GTCTCTAACT TTGTCAACAA 18840 

TACTTCCAAA AGTCGACTGA AGAAAGAAGA TAAATATTAA AGGTATGAGA ACTCCTATCC 18900 

CAATCATCAC ATTCGAAAAA ATAGACTGAT ACTTTCTGAA GACCCTAGTT TGAGCCAAGA 18960 

AATGCACTGC CACTACCATC ACTAGAGCCA CAGAGACAAA TAATAAGGTC AAGGACAGTA 19020

GCATCAAAGG CAAACCCAGC CATAGAGAAG GAGCTAGCCT AATGTAGAGG ACCAGAAAAT 19080 

AAGCTAGGAT TGGTACAATT CCAGTTAGAG CTGGCAAAAG GACAGACAGT CCTTTAGGAA 19140 

TTATAATCTC TGATTCTTTA AAGGCATAGG GCCTATACGA TACCAAATCG TTAGTCTCAT 19200 

AAAAGACATT GTAAAAGGCC GTTAAAGAAG TTGAAAAGGC AATCACTAGT AAAATAGCAA 19260 

TCATCGAGCT AAAATAAATA GGTATTTCCT CAAAAGGAAA ATGAATGGCT ATATTACTAA 19320 

AACAGATGAT CATCAAGAGA CTGGAAAAAA TGTAAGAACT TAAGACTCTA GCGGAAACAT 19380 

TTACTTTTTT 19390 

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18436 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

CCGAGCGTCG TTACAGACTT TATCAAGATT GGACGCAAGA AGAAATTCAA CATATAAAGG 60 

AAAATATGGC ACAATCTCCA TGGCATACTC ATTACCATGT TGAGCCAAAA ACAGGACTTC 120 

TCAACGACCC AAATGGCTTT TCTTACTTTG ATGGCAAGTG GATCCTCTTT TACCAGAATT 180 

TTCCTTTTGG TGCAGCCCAC GGTTTAAAAT CTTGGGCACA GCTAGAAAGT GATGATTTGA 240 

TTCACTTTAA AGAAACTGGA ATCAAAGTTT TACCAGATAC TCCATTAGAT AGCCACGGTG 300 

CCTACTCTGG TTCTGCCATG CAATTTGGCG ATAACTTATT CCTATTTTAT ACAGGAAATG 360 

TTCGCGATAA AAACTGGATC CGTCACCCAT ACCAGATCGG TGCTTTGATG GACAAGGAGG 420

GTAAGATTAC AAAGATTGAC AAGATCTTGA TTGACCAGCC AGCAGACTCT ACTGACCACT 480

TCCGCGATCC ACAAATTTTT AACTTTCAGG GTCAATATTA TGCCATTGTC GGCGGACAAG 540

ACTTGGAGAA AAAAGGTTTC GTTCGTCTCT ACAAGGCTGT CAATAACGAC TACACAAACT 600

GGCAAGCAGT TGGCGACCTT GACTTTGCTA ACGACCGTAC TGCCTACATG ATGGAATGTC 660

CTAATTTGGT CTTTGTAGAG GAACAACCTG TCCTTCTCTA CTGTCCACAA GGATTGGATA 720

AGAAAGTTCT AGACTACGAT AATATCTTTC CAAATATGTA TAAGATCGGG GCTTCCTTTG 780

ACCCTAAAAA TGCCAAAATG GTAGATGTGT CTCAACTTCA AAACATGGAT TACGGTTTCG 840

AAGCCTATGC AACTCAAGCC TTCAACGCTC CTGATGGGCG TGCTCTAGCA GTTAGCTGGC 900

TTGGTTTGCC AGATGTTTCT TACCCATCTG ACCGTTTTGA CCACCAAGGA ACCTTCTCTT 960


TGGTCAAGGA ACTCACTATC AAAGACGACA AGCTCTACCA GTATCCAGTC GCTGCTATTA 1020

AGGACCTTCG TGCTTCTGAA GAAGCCTTCT CAAACCGTTC CCAAACCAAG AACACTTACG 1080

AACTTGAACT CAACTTGGAA GCTAATAGCC AGAGCGAGAT TGTCTTACTT GCTGATAAAG 1140

AAGGTAAGGG ACTTTCAATC AACTTTGACC TTGTAAACGG TCAAGTAACA GTGGATCGTA 1200

GCCAGGCTGG AGAACAGTAT GCCCAAGAAT TTGGGACAAC TCGTTCTTGC CCTATCGAGA 1260

ATCAGGCTAC TACTGCTACA ATCTTCATCG ATAACTCTGT CTTTGAAATT TTCATCAATA 1320

AAGGAGAAAA AGTATTTTCT GGTCGTGTCT TCCCACATGC GGACCAAAAT GGTATCCTGA 1380

TTAAATCTGG AAACCCAACT GGAACTTACT ATGAATTAGA TTATGGTCGC AAAACTAACT 1440

GATGTCGCCA AACTTGCAGG CGTCAGTCCT ACTACCGTTT CTCGGGTTAT CAATAAAAAA 1500

GGGTATCTAT CTGAGAAAAC CATCCAAAAA GTCAATGAAG CCATGCGAGA ATTGGGCTAT 1560


AAACCCAACA ACCTGGCTCG TAGTCTGCAA GGAAAATCAG CTAAGTTAAT CGGCTTGATT 1620

TTCCCCAATA TTTCCAATGT TTTCTATGCA GAATTGATTG ATAAATTGGA ACACCAACTC 1680

TTCAAAAATG GTTACAAGAC CATCATCTGC AACAGTGAAC ATGATTCTGA GAAGGAACGC 1740

GAATACATCG AAATGTTGGA AGCCAATCAG GTGGACGGCA TCATTTCTGG TAGTCACAAC 1800

CTAGGAATCG AAGACTACAA TCGTGTGACA GCGCCGATTA TTTCCTTTGA CCGAAACCTA 1860

TCGCCAGACA TCCCTGTCGT CTCCTCTGAC AACTATGCTG GTGGGGTTCT TGCTGCCCAA 1920

ACCTTGGTCA AGACAGGTGC CCAGTCTATC ATCATGATTA CAGGGAATGA CAATTCTAAT 1980

TCGCCAACCG GACTGCGCCA CGCTGGTTTT GCATCCGTAC TCCCAAAAGC TCCTATTATC 2040

AATGTTTCCA GTGACTTTTC TCCCGTCAGA AAAGAAATGG AAATCAAGAA TATCTTGACC 2100

CGGGAAAAAC CAGATGCCAT TTTTGCTTCG GATGATTTGA CAGCTATTCT GGTCATTAAA 2160

ATCGCTCAAG AATTGGGCAT TTCTGTCCCA AAAGAGCTCA AGGTCATCGG CTATGATGGG 2220



